Translating Unicode characters from input

I have a string containing Unicode escape sequences that I need to decode. When I hardcode the string in Python, it works. However, if I read it through input(), the escapes are not translated. For example:

input_0 = input()  # user types: f\u00eate
print(input_0)     # prints f\u00eate
word = "f\u00eate"
print(word)        # prints fête

How can I turn the Unicode escape sequences in the input string into the characters they represent? I have tried using str(word) too.

Solution:

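What you get from input() is a plain string whose backslashes were never interpreted. Python only processes escape sequences such as \u00ea when they appear in a string literal in source code. Text read at runtime contains the literal characters, so \u00ea from input() is 6 separate characters (\, u, 0, 0, e, a) rather than the single character ê.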
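The difference can be checked without typing anything. In this small sketch, a raw string literal stands in for what input() would return (that substitution, and the variable names, are assumptions made for illustration):

```python
# A raw literal keeps the backslash as-is, like text read by input().
typed = r"f\u00eate"
# A normal literal: Python turns \u00ea into "ê" when it parses the source.
literal = "f\u00eate"

print(len(typed))        # 9 characters: f \ u 0 0 e a t e
print(len(literal))      # 4 characters: f ê t e
print(typed == literal)  # False
print(list(typed[1:7]))  # the 6 literal characters of the escape
```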

You can encode it with "raw-unicode-escape", which leaves the backslashes as they are, and then decode the result with "unicode-escape", which interprets them:

input_0 = input()  # f\u00eate
print(input_0.encode("raw-unicode-escape").decode("unicode-escape"))

Explanation for these two encodings: https://docs.python.org/3/library/codecs.html#text-encodings
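As a minimal sketch, the same two-step conversion can be wrapped in a helper. The name decode_escapes is hypothetical, and the raw string again stands in for user input. Note that "unicode-escape" interprets every backslash escape it recognizes (for example \n as well), so this is only appropriate when the input is expected to contain such escapes.

```python
def decode_escapes(text: str) -> str:
    """Interpret literal backslash escapes (such as \\u00ea) found in text.

    Hypothetical helper wrapping the encode/decode round trip above.
    """
    return text.encode("raw-unicode-escape").decode("unicode-escape")


typed = r"f\u00eate"  # stands in for the value returned by input()
print(decode_escapes(typed))  # fête

# Characters that are already decoded pass through unchanged.
print(decode_escapes("fête " + r"\u00e0 Paris"))  # fête à Paris
```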
