Understanding unicode in python

I have a script like this:

    #!/usr/bin/python3
    # -*- coding: utf-8 -*-
    import re

    var1 = '1F1EB 1F1F7'
    var1 = re.sub(r' ', r'\\U000', var1)
    var1 = r'\U000' + var1
    var2 = '\U0001F1EB\U0001F1F7'
    if var1 == var2:
        print('true')
    print(type(var1))
    print(var1)
    print(type(var2))
    print(var2)

Output:

    <class 'str'>
    \U0001F1EB\U0001F1F7
    <class 'str'>
    🇫🇷

Input variable is var1, but I need…
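The excerpt is cut off, so the exact goal is an assumption here: the sketch below supposes the aim is for `var1` to compare equal to `var2`. A `\U` escape is only interpreted inside a string literal when the source is compiled; building the same characters at runtime produces a literal backslash followed by text. One way around this is to parse each hex code point and convert it with `chr`. The helper name `from_hex_codepoints` is hypothetical.

```python
def from_hex_codepoints(hex_codes: str) -> str:
    """Turn space-separated hex code points into the characters they name."""
    # int(code, 16) parses the hex digits; chr() maps the number to a character.
    return "".join(chr(int(code, 16)) for code in hex_codes.split())


var1 = from_hex_codepoints("1F1EB 1F1F7")
var2 = "\U0001F1EB\U0001F1F7"
print(var1 == var2)  # True
```

This avoids regular expressions and escape sequences entirely, since the input is already a list of numbers in hex form.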

Why is this python regular expression not ignoring accents?

I am using the following regular expression as a filter in an application that connects to a MongoDB database:

    {"$regex": re.compile(r'\b' + re.escape(value) + r'\b', re.IGNORECASE | re.UNICODE)}

The regular expression meets my search criteria, but it does not ignore accents. For example: The database entry…
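`re.IGNORECASE` only folds case; it does not treat "é" and "e" as the same character, and `re.UNICODE` is already the default for `str` patterns in Python 3. A minimal sketch of accent-insensitive matching on the Python side follows, assuming both the search value and the candidate text can be normalized before comparison. It does not change how MongoDB itself evaluates `$regex`; the helper names `strip_accents` and `matches_ignoring_accents` are hypothetical.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches_ignoring_accents(value: str, candidate: str) -> bool:
    """Whole-word, case-insensitive match with accents stripped on both sides."""
    pattern = re.compile(r"\b" + re.escape(strip_accents(value)) + r"\b",
                         re.IGNORECASE)
    return pattern.search(strip_accents(candidate)) is not None


print(matches_ignoring_accents("jose", "José Pérez"))  # True
```

Filtering inside the database would need the stored text to be normalized the same way, for example in a separate accent-free field.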

Python – Issues with Unicode String from API Call

I’m using Python to call an API that returns the last names of some soccer players. One of the players has a "ć" in his name. When I call the endpoint, the name prints out as a Unicode escape sequence instead of the character:

    >>> last_name = (json.dumps(response["response"][2]["player"]["lastname"]))
    >>> print(last_name)
    "Mitrovi\u0107"
    >>> print(type(last_name))
    <class 'str'>

If I…
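The escape in the output comes from `json.dumps`, which by default (`ensure_ascii=True`) escapes every non-ASCII character. A short sketch of the difference follows, using a stand-in string in place of the real API response, which is not shown in the excerpt.

```python
import json

# Stand-in for response["response"][2]["player"]["lastname"] (assumed value).
last_name = "Mitrovi\u0107"

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(last_name)
# escaped is the 15-character text "Mitrovi\u0107", quotes included

# ensure_ascii=False keeps the character itself in the JSON text.
readable = json.dumps(last_name, ensure_ascii=False)
# readable is "Mitrović", quotes included

# The decoded value is already a normal str; json.dumps is only needed
# when the goal is to produce JSON text again.
plain = last_name
```

If the value is only being displayed, printing the decoded string directly avoids both the escapes and the surrounding quotes.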

How do I delete the unicode decimal codes from a string in python

So I have this string:

    1993 &#8211; Liam Payne, English singer-songwriter&#91;17&#93;

As you can see, it contains the characters &# followed by a number. How can I delete them automatically from my string, like you can do with \u1321 type chars for example? I tried using .encode("ascii", "ignore") and .decode() but with no success. Thanks…
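Sequences such as `&#8211;` are HTML numeric character references, and they consist only of ASCII characters, which is why `.encode("ascii", "ignore")` leaves them untouched. A minimal sketch of two options follows: decoding them with the standard library's `html.unescape`, or deleting them with a regular expression. Which one is wanted depends on the goal, which the excerpt does not fully state.

```python
import html
import re

raw = "1993 &#8211; Liam Payne, English singer-songwriter&#91;17&#93;"

# Option 1: decode the references into the characters they stand for.
decoded = html.unescape(raw)
# decoded == "1993 \u2013 Liam Payne, English singer-songwriter[17]"

# Option 2: delete every decimal reference outright.
removed = re.sub(r"&#\d+;", "", raw)
# removed == "1993  Liam Payne, English singer-songwriter17"
```

Note that deleting the references leaves a double space and the bare footnote number behind, so decoding first and cleaning up afterwards may give tidier text.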

How to call the Win32 GetCurrentDirectory function from C#?

The prototype of GetCurrentDirectory:

    DWORD GetCurrentDirectory(
      [in]  DWORD  nBufferLength,
      [out] LPTSTR lpBuffer
    );

DWORD is unsigned long, LPTSTR is a pointer to a wchar buffer in a Unicode environment. It can be called from C++:

    #define MAX_BUFFER_LENGTH 256

    int main() {
      TCHAR buffer[MAX_BUFFER_LENGTH];
      GetCurrentDirectory(MAX_BUFFER_LENGTH, buffer);
      return 0;
    }

I tried to encapsulate this win32 function in…

Convert "weird" strings to normal python strings

Context: I’m trying to convert characters like these:

    𝐁𝐔𝐈𝐋𝐃𝐈𝐍𝐆 𝙎𝙥𝙚𝙚𝙙𝙮 𝕋𝕌𝔼𝕊𝔻𝔸𝕐 𝕤𝕡𝕒𝕘𝕙𝕖𝕥𝕥𝕚

to normal Python strings (speedy, building, tuesday, etc.) and save them into a new dataframe to be exported into a new Excel file. For example, the character 𝕒 (U+1D552) should be converted to a (U+0061). I’m reading each string from an excel…
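The styled letters in the examples are Mathematical Alphanumeric Symbols, which carry compatibility mappings to the ordinary Latin letters. A minimal sketch using the standard library's `unicodedata.normalize` with the NFKC form follows; the helper name `to_plain` is hypothetical, and the dataframe and Excel steps are left out because the excerpt does not show them.

```python
import unicodedata


def to_plain(text: str) -> str:
    """Apply NFKC so styled letters fold to their plain equivalents."""
    return unicodedata.normalize("NFKC", text)


samples = ["𝐁𝐔𝐈𝐋𝐃𝐈𝐍𝐆", "𝙎𝙥𝙚𝙚𝙙𝙮", "𝕋𝕌𝔼𝕊𝔻𝔸𝕐", "𝕤𝕡𝕒𝕘𝕙𝕖𝕥𝕥𝕚"]
print([to_plain(s) for s in samples])
# ['BUILDING', 'Speedy', 'TUESDAY', 'spaghetti']
```

NFKC preserves case, so a call to `.lower()` would be needed afterwards if lowercase output is wanted. It also folds other compatibility characters, such as ligatures and full-width forms, which may or may not be desirable for a given dataset.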