Convert bytes object to string object in python

python code

#!python3

import sys
import os.path
import codecs

if not os.path.exists(sys.argv[1]):
    print("File does not exist: " + sys.argv[1])
    sys.exit(1)
file_name = sys.argv[1]

with codecs.open(file_name, 'rb', errors='ignore') as file:
    file_contents = file.readlines()

for line_content in file_contents:
    print(type(line_content))
    line_content = codecs.decode(line_content)
    print(line_content)
    print(type(line_content))

File content : Log.txt

b'\x03\x00\x00\x00\xc3\x8a\xc3\xacRb\x00\x00\x00\x00042284899:ATBADSFASF:DSF456582:US\r\n1'

Output:

python3 file_convert.py Log.txt
<class 'bytes'>
b'\x03\x00\x00\x00\xc3\x8a\xc3\xacRb\x00\x00\x00\x00042284899:ATBADSFASF:DSF456582:US\r\n1'
<class 'str'>

I tried all of the methods below:

line_content = line_content.decode('UTF-8')
line_content = line_content.decode()
line_content = codecs.decode(line_content, 'UTF-8')

Is there any other way to handle this?
The line_content variable still holds the byte data and only the type changes to str, which is kind of confusing.

>Solution :

The data in Log.txt is the string representation of a Python bytes object. That is odd, but we can deal with it. Since it's a bytes literal, evaluate it with ast.literal_eval, which converts it to a real Python bytes object. There is still the question of what its encoding is.
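As a minimal sketch of those two steps in isolation, the snippet below uses the sample line from the question as an in-memory string instead of reading Log.txt. The variable names are only illustrative, and UTF-8 is an assumption about the encoding, not something the data confirms.

```python
import ast

# The text as it sits in the file: a str whose characters spell out a bytes literal.
raw_text = r"b'\x03\x00\x00\x00\xc3\x8a\xc3\xacRb\x00\x00\x00\x00042284899:ATBADSFASF:DSF456582:US\r\n1'"

# Step 1: evaluate the literal to get a real bytes object.
as_bytes = ast.literal_eval(raw_text)

# Step 2: decode the bytes into text (UTF-8 is a guess, not a known fact).
as_text = as_bytes.decode("utf-8")

print(type(as_bytes))  # <class 'bytes'>
print(type(as_text))   # <class 'str'>
print(repr(as_text))
```

Calling decode directly on the text read from the file cannot work, because that text is already a str that merely looks like bytes. Only after literal_eval is there anything to decode.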

I don’t see any advantage to using codecs.open. That’s a way to read Unicode files in Python 2.7 and is not usually needed in Python 3. Guessing UTF-8, your code would be:

#!python3

import sys
import os
import ast

if not os.path.exists(sys.argv[1]):
    print("File does not exist: " + sys.argv[1])
    sys.exit(1)
file_name = sys.argv[1]

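# Open in text mode: the file holds the text of a bytes literal, not raw bytes.
with open(file_name) as file:
    file_contents = file.readlines()

for line_content in file_contents:
    print(type(line_content))
    # Evaluate the literal to get real bytes, then decode them (UTF-8 is a guess).
    line_content = ast.literal_eval(line_content).decode("utf-8")
    print(line_content)
    print(type(line_content))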
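
If the log might contain blank lines or lines that are not bytes literals, ast.literal_eval will raise an exception on them. One possible way to guard against that is sketched below. The helper name decode_bytes_literal is hypothetical, and both the UTF-8 default and the choice to replace undecodable bytes are assumptions, not requirements from the question.

```python
import ast


def decode_bytes_literal(line, encoding="utf-8"):
    """Return the decoded text for a line holding a bytes literal, else None.

    Hypothetical helper: the name and the UTF-8 default are assumptions.
    """
    line = line.strip()
    if not line:
        return None  # blank line, nothing to evaluate
    try:
        value = ast.literal_eval(line)
    except (ValueError, SyntaxError):
        return None  # the line is not a valid Python literal
    if not isinstance(value, bytes):
        return None  # a literal, but not a bytes literal
    # errors="replace" substitutes U+FFFD for bytes that are not valid in the encoding.
    return value.decode(encoding, errors="replace")


print(decode_bytes_literal("b'abc'\n"))
print(decode_bytes_literal("\n"))
```

In the loop above, each line_content could be passed to this helper and the None results skipped.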