How to convert utf-8 characters to "normal" characters in string in python3.10?

byMR

July 9, 2022

I have raw data that looks like this:

25023,Zwerg+M%C3%BCtze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m%21k4z3,90,1,1539,2530

It is saved as a .txt file: https://de205.die-staemme.de/map/player.txt

The "characters" starting with % are Unicode, as far as I can tell.

I found the following table about it: https://www.i18nqa.com/debug/utf8-debug.html

Here is my code so far:

import urllib.request

# Download the file, then read it back as UTF-8 text.
urllib.request.urlretrieve(url, pfad + "player.txt")

with open(pfad + "player.txt", "r", encoding="utf-8") as f:
    raw = f.read()
raw = raw.split("\n")

Python does not convert the %-characters. They are read as if they were separate characters.

Is there a way to convert these characters without calling .replace like 200 times?

Thank you very much in advance for help and/or useful hints!

Solution:

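The % sequences are URL encoding (percent-encoding) of the UTF-8 bytes, not a separate kind of character; use urllib.parse.unquote to decode the string. Note that unquote leaves + unchanged, while urllib.parse.unquote_plus also turns + into a space.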
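To make the encoding less mysterious, here is a small sketch of what a sequence like %C3%BC stands for. The variable names are only illustrative; the calls are standard-library ones.

```python
from urllib.parse import quote, unquote

# "%C3%BC" stands for the two bytes 0xC3 0xBC, which are the UTF-8 encoding of "ü".
utf8_bytes = bytes.fromhex("C3BC")
decoded_by_hand = utf8_bytes.decode("utf-8")

# unquote does the same two steps (hex -> bytes -> UTF-8 text) in one call.
decoded_by_unquote = unquote("%C3%BC")

# quote goes the other way and reproduces the escaped form.
encoded_again = quote("ü")

print(decoded_by_hand, decoded_by_unquote, encoded_again)
```

So there is no need for a lookup table or repeated .replace calls: every %XX is just one byte written in hex.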

>>> raw = """25023,Zwerg+M%C3%BCtze,0,1,986,3780
... 25871,red+earth,0,1,38,8349
... 25931,K4m%21k4z3,90,1,1539,2530"""
>>> import urllib.parse
>>> print(urllib.parse.unquote(raw))
25023,Zwerg+Mütze,0,1,986,3780
25871,red+earth,0,1,38,8349
25931,K4m!k4z3,90,1,1539,2530
