Convert "weird" strings to normal Python strings

Context: I’m trying to convert characters like these:

𝐁𝐔𝐈𝐋𝐃𝐈𝐍𝐆
𝙎𝙥𝙚𝙚𝙙𝙮
𝕋𝕌𝔼𝕊𝔻𝔸𝕐
𝕤𝕡𝕒𝕘𝕙𝕖𝕥𝕥𝕚

To normal Python strings (speedy, building, tuesday, etc.) and save them into a new dataframe to be exported to a new Excel file. For example, the character 𝕒 (U+1D552) should be converted to a (U+0061). I’m reading each string from an Excel file using read_excel. Should I pass some kind of encoding="utf-8" argument to the read_excel function? Or is there a way to replace those characters using re? Or even encode("ascii").decode("utf-8")?

Thank you in advance

Solution:

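You can normalize Unicode strings with unicodedata.normalize. The NFKC form replaces compatibility characters, such as these mathematical alphanumeric symbols, with their plain equivalents: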

>>> from unicodedata import normalize
>>> test_str = "𝐁𝐔𝐈𝐋𝐃𝐈𝐍𝐆 𝙎𝙥𝙚𝙚𝙙𝙮 𝕋𝕌𝔼𝕊𝔻𝔸𝕐 𝕤𝕡𝕒𝕘𝕙𝕖𝕥𝕥𝕚"
>>> print(normalize('NFKC', test_str))
BUILDING Speedy TUESDAY spaghetti
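
Since the question is about values read from a spreadsheet, here is a minimal sketch of how the same normalization might be wrapped for use on a whole column. The helper name to_plain_text and the sample list are made up for illustration; only unicodedata.normalize comes from the answer above. The strings returned by read_excel are already ordinary Python str objects, so no encoding argument is involved, and encode("ascii") would raise UnicodeEncodeError on these characters rather than convert them.

```python
import unicodedata


def to_plain_text(value):
    """Fold compatibility characters (styled letters etc.) to plain forms.

    Non-string values (numbers, None) are returned unchanged so the
    function can be mapped over a column with mixed cell types.
    """
    if not isinstance(value, str):
        return value
    return unicodedata.normalize("NFKC", value)


# Hypothetical sample values standing in for one spreadsheet column.
styled = ["𝐁𝐔𝐈𝐋𝐃𝐈𝐍𝐆", "𝙎𝙥𝙚𝙚𝙙𝙮", "𝕋𝕌𝔼𝕊𝔻𝔸𝕐", "𝕤𝕡𝕒𝕘𝕙𝕖𝕥𝕥𝕚", 42, None]
plain = [to_plain_text(v) for v in styled]
print(plain)
```

With pandas, the helper could be applied to a column with something like df["name"] = df["name"].map(to_plain_text) before writing the result with df.to_excel; the column name "name" is a placeholder. Case is preserved, so add .lower() if lowercase output is wanted. Note that NFKC also folds other compatibility characters (full-width letters, ligatures such as ﬁ), which may or may not be desirable for your data.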
