Converting characters like '³' to Integer in Python

My dataset contains the character '³', and I need to process it.

The general idea is to detect whether a character is a digit, convert it to an integer, and continue processing with that value.

>>> x = '³'
>>> x.isdigit() # Returns True
True

Python detects this character as a digit, but raises the following error when I try to convert it:

>>> int(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '³'

I would like such characters to be converted to integers as well, to simplify my further processing.

Not sure if this helps, but here is my locale info:

>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'UTF-8')

Solution:

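str.isdigit() returns True for any character whose Unicode Numeric_Type is Digit or Decimal, which includes superscripts such as '³', whereas int() only parses decimal digits (the characters for which str.isdecimal() is True). You can use unicodedata.normalize with the NFKC form to map '³' to the regular digit '3' before calling int().
Here is an example with some error handling: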

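import unicodedata

x = '³'
try:
    # NFKC normalization maps the superscript '³' to the regular digit '3'
    regular_digit = unicodedata.normalize('NFKC', x)
    # int() can parse the normalized string
    integer_value = int(regular_digit)
    print(integer_value)
except ValueError:
    # Raised when the normalized text is still not a valid integer literal
    print(f"'{x}' is not a convertible superscript digit.")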
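
If you need this in several places, the same idea can be wrapped in a small function. The sketch below is only an illustration: the name to_int is hypothetical and not part of the original answer, and it assumes you want to reject anything that is not purely digits after normalization (so signs and surrounding whitespace are rejected too). It also shows the isdigit() / isdecimal() difference that explains the original error, and unicodedata.digit() as a per-character alternative.

```python
import unicodedata


def to_int(text):
    """Convert digit-like text such as '³' or '²³' to an int.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    # NFKC folds compatibility characters (superscripts, full-width
    # digits, ...) into their plain equivalents, e.g. '³' becomes '3'.
    normalized = unicodedata.normalize('NFKC', text)
    # Assumption: only pure decimal-digit strings are accepted here.
    if not normalized.isdecimal():
        raise ValueError(f"{text!r} cannot be converted to an integer")
    return int(normalized)


# isdigit() accepts the superscript, isdecimal() does not.
print('³'.isdigit(), '³'.isdecimal())  # True False
print(to_int('³'))                     # 3
print(to_int('²³'))                    # 23
# For a single character, unicodedata.digit() returns its digit value.
print(unicodedata.digit('³'))          # 3
```

Checking isdecimal() before calling int() is a cheap way to predict whether the conversion will succeed, which is the check the question was originally trying to do with isdigit().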