
Checking for a substring in a Unicode value

Suppose I have a variable that has a unicode value in a Python script.

 place_name = u'K\u016bla Mountain'

In this instance, \u016b is the code point for a u with a macron accent mark over it. I want to check for ‘016b’ in the string and, if found, change place_name to u'Kula Mountain'. If it was just a string, I could use:

if '016b' in place_name:
    place_name = 'Kula Mountain'

But that won’t work with the Unicode value. What’s the simplest way to check for ‘016b’ and, if found, change place_name to the Unicode value u'Kula Mountain'?


Note, I tried:

if '016b' in ord(alt_map_name):
    place_name = u'Kula Mountain'

as suggested by other posts on this issue, but got

Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: ord() expected a character, but string of length 16 found

EDIT: To be clear, I just want to check for the macron (0x016b), be it with a ‘u’ or any other letter.

>Solution :

place_name = u'K\u016bla Mountain'


if 0x016b in [ord(c) for c in place_name]:
    place_name = u'Kula Mountain'
print(place_name)

Output:

Kula Mountain

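In your case, 0x016b is the Unicode code point of the single character 'ū' (u with a macron), so the four-character string '016b' never appears in place_name. ord() takes a single character as its argument, which is why passing it the whole string raises a TypeError. Instead, you can use a list comprehension to get the code point of every character in the string and check whether 0x016b is among them.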
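The EDIT in the question asks about a macron on any letter, whereas U+016B covers only the letter u with a macron. The following is a minimal sketch of one way to handle the general case, not a definitive implementation. It assumes that decomposing the text with unicodedata.normalize('NFD', ...) and looking for U+0304 (COMBINING MACRON) is acceptable for your data. The names COMBINING_MACRON, has_macron and strip_macrons are hypothetical helpers introduced here for illustration. It also shows that a single known code point can be tested directly with the in operator.

```python
import unicodedata

# Hypothetical constant: U+0304 is the COMBINING MACRON code point.
COMBINING_MACRON = u'\u0304'


def has_macron(text):
    """Return True if any character in text carries a macron."""
    # NFD splits precomposed letters such as U+016B into base letter + U+0304.
    return COMBINING_MACRON in unicodedata.normalize('NFD', text)


def strip_macrons(text):
    """Return text with macrons removed and other accents left intact."""
    decomposed = unicodedata.normalize('NFD', text)
    without_macrons = decomposed.replace(COMBINING_MACRON, u'')
    # Recompose so any remaining accented letters return to precomposed form.
    return unicodedata.normalize('NFC', without_macrons)


place_name = u'K\u016bla Mountain'

# Direct membership test for the single code point U+016B.
print(u'\u016b' in place_name)

# General test: a macron over any letter.
if has_macron(place_name):
    place_name = strip_macrons(place_name)
print(place_name)
```

With the sample value this should print True followed by Kula Mountain. Stripping the combining mark avoids hard-coding the replacement name, so the same helper would apply to other place names, but whether dropping the accent is appropriate depends on how the names are used downstream.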
