I have a string with special characters, as follows:
req_str = 'N\x08NA\x08AM\x08ME'
If I print it, I correctly get the word "NAME":
>>> print(req_str)
NAME
Now I want to extract the string NAME from the string.
I tried
''.join(c for c in 'N\x08NA\x08AM\x08ME' if c.isprintable())
This produces:
'NNAAMME'
I understand this has something to do with a special encoding, but I am not very familiar with string encodings. How can I extract the word 'NAME' as a string in this situation?
Solution:
According to the ASCII table, \x08 is the backspace character. In a Python string literal it can also be written as \b:
req_str1 = "N\x08NA\x08AM\x08ME"
req_str2 = "N\bNA\bAM\bME"
print(req_str1)
print(req_str2)
print(req_str1 == req_str2)
output:
NAME
NAME
True
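When the string is printed, the terminal writes an N, the backspace moves the cursor back one position, and the second N is written over the first. That's why you see only one N in the final output. The same thing happens for A, M and E. The backspace characters are still part of the string, though: isprintable() filters out only the backspaces themselves, which is why your attempt leaves 'NNAAMME'.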
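To confirm that the backspaces are really stored in the string and only hidden by the terminal, you can look at its repr and length. This is just a small check, not part of the fix:

```python
req_str = "N\x08NA\x08AM\x08ME"

# repr() shows escape sequences instead of sending them to the terminal
shown = repr(req_str)
print(shown)

# 7 letters plus 3 backspace characters
length = len(req_str)
print(length)

# Backspace is not a printable character, so isprintable() drops it
print("\x08".isprintable())
```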
To extract NAME, remove each backspace together with either the character before it or the character after it. Because every letter in this string is duplicated around a backspace, both approaches give you a clean NAME.
import re
req_str = "N\x08NA\x08AM\x08ME"
print(re.sub(r".\x08", "", req_str))
print(re.sub(r"\x08.", "", req_str))
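If the input might contain several backspaces in a row, a single regex pass can leave some of them behind. A minimal sketch of a more general approach is below. It assumes each backspace should delete the previously kept character, which is a simplification of what a terminal does (a terminal only moves the cursor left and lets later characters overwrite). The function name apply_backspaces is made up for this example:

```python
def apply_backspaces(text: str) -> str:
    """Treat each backspace as deleting the previously kept character."""
    out = []
    for ch in text:
        if ch == "\b":
            # Drop the last kept character, if there is one
            if out:
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


print(apply_backspaces("N\x08NA\x08AM\x08ME"))
print(apply_backspaces("abc\b\bd"))
```

For the string in the question this gives the same NAME as the regex versions. The second call shows the difference: two consecutive backspaces remove both "c" and "b".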