Reading strings with special characters in Python

I have a string with special characters, as follows:

req_str = 'N\x08NA\x08AM\x08ME'
## If I print it I correctly get the word "NAME"
print(req_str)

>>> print(req_str)
NAME

Now I want to extract the word NAME from this string.
I tried:

''.join(c for c in 'N\x08NA\x08AM\x08ME' if c.isprintable())
## this produces
'NNAAMME'

I understand this has something to do with a special encoding, but I am not very familiar with string encodings. How can I extract the word 'NAME' as a string in this situation?

>Solution :

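According to the ASCII table, \x08 is the backspace control character. It is not printable, which is why the isprintable() filter drops it and leaves both copies of each letter (NNAAMME). In Python it can also be written as \b: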

req_str1 = "N\x08NA\x08AM\x08ME"
req_str2 = "N\bNA\bAM\bME"
print(req_str1)
print(req_str2)
print(req_str1 == req_str2)

output:

NAME
NAME
True

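When the string is printed, the terminal writes an N, moves the cursor back one position for the backspace, and then writes the second N over the first. That is why you see only one N in the output. The same happens for A, M and E. The string itself is unchanged; only its display collapses.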
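To check that the backspaces are still in the string and that only the terminal rendering hides them, you can inspect it with repr(), len() and count(). This is a small sketch; the variable names raw, length and backspaces are just for illustration:

```python
req_str = "N\x08NA\x08AM\x08ME"

# repr() shows the raw characters, including each backspace as \x08
raw = repr(req_str)
print(raw)

# len() counts every character: 7 letters plus 3 backspaces
length = len(req_str)
print(length)

# count() confirms how many backspace characters are present
backspaces = req_str.count("\x08")
print(backspaces)
```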

To extract NAME, remove each backspace together with either the character before it or the character after it. Both give a clean NAME here, because every letter is duplicated around its backspace.

import re

req_str = "N\x08NA\x08AM\x08ME"
print(re.sub(r".\x08", "", req_str))
print(re.sub(r"\x08.", "", req_str))
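The regex approach works for this input, where every backspace sits between two copies of the same letter. For strings with several backspaces in a row, a single re.sub pass can leave stray backspaces behind. One possible alternative, sketched below, is to walk through the string and treat each backspace as "erase the previous character". The function name apply_backspaces is hypothetical, and the erase behaviour is an assumption: a real terminal only moves the cursor back, so its display can differ for inputs other than this overstrike pattern.

```python
def apply_backspaces(text: str) -> str:
    """Treat each backspace as deleting the character before it (an assumption)."""
    out = []
    for ch in text:
        if ch == "\b":
            # Drop the previous character, if there is one
            if out:
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


print(apply_backspaces("N\x08NA\x08AM\x08ME"))  # NAME
print(apply_backspaces("abc\b\bd"))             # ad
```

For the string in the question this gives the same result as the regex with ".\x08", since both drop the character before each backspace.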