Ambiguous output with python and files

I was testing how file streams work in Python and wrote the following code:

with open('test2.txt') as f:
    while r := f.read(1):
        print(repr(r), f.tell(), sep='\tindex:', end='\n***************\n')

Contents of test2.txt are as follows:

012345

6789

I ran the code and the output is as follows:

'0'     index:1
***************
'1'     index:2
***************
'2'     index:3
***************
'3'     index:4
***************
'4'     index:5
***************
'5'     index:18446744073709551623
***************
'\n'    index:8
***************
'\n'    index:10
***************
'6'     index:11
***************
'7'     index:12
***************
'8'     index:13
***************
'9'     index:14
***************

Can someone please help me understand why f.tell() returns 18446744073709551623, and why '\n' has index 8 instead of 7 if we assume '5' gets index 6? Thank you in advance.

>Solution :

The Python documentation says that for a file opened in text mode, f.tell() returns an opaque number rather than a byte offset. The value is only meaningful when passed back to f.seek(); it is not guaranteed to equal the number of characters or bytes read so far.

In your code snippet the file is opened with open('test2.txt'), which is text mode, so the numbers printed after each f.read(1) are such opaque values. They often look like byte offsets, but that is not guaranteed. If you need real byte offsets, open the file in binary mode ('rb'), where tell() returns the position in bytes from the beginning of the file.
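As a minimal sketch of the two safe ways to use tell(), the code below reads a file in binary mode (where tell() is a plain byte offset) and uses a text-mode tell() value only as a bookmark for seek(). The sample bytes are an assumption: they mirror test2.txt under the guess that it was saved with Windows (\r\n) line endings, which the question does not confirm. The helper names write_sample, byte_offsets and text_round_trip are made up for illustration.

```python
import os
import tempfile

# Assumed contents of test2.txt: the question's digits with CRLF line endings.
SAMPLE = b"012345\r\n\r\n6789"


def write_sample(data=SAMPLE):
    """Write the sample bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def byte_offsets(path):
    """Read one byte at a time in binary mode; tell() is a plain byte offset."""
    offsets = []
    with open(path, "rb") as f:
        while chunk := f.read(1):
            offsets.append((chunk, f.tell()))
    return offsets


def text_round_trip(path, n_chars):
    """In text mode, treat tell() only as a bookmark to pass back to seek()."""
    with open(path, encoding="ascii") as f:
        f.read(n_chars)
        bookmark = f.tell()   # opaque value, not necessarily a byte count
        rest_first = f.read()
        f.seek(bookmark)      # jump back to the bookmarked spot
        rest_again = f.read()
    return rest_first, rest_again


if __name__ == "__main__":
    path = write_sample()
    try:
        for chunk, pos in byte_offsets(path):
            print(repr(chunk), pos, sep="\tindex:")
        print(text_round_trip(path, 6))
    finally:
        os.remove(path)
```

In binary mode every position from 1 to 14 shows up, including 7, because \r and \n are returned as separate bytes.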

The newline gets 8 instead of 7 most likely because test2.txt uses Windows line endings, so each line break is stored on disk as two bytes, \r\n. In text mode Python translates that pair into a single '\n' character when reading, but the underlying position still advances by two bytes. After '5' the position is 6; reading the first line break consumes both \r and \n and leaves it at 8, and the second line break moves it to 10. The totals agree with this: 6 + 2 + 2 + 4 = 14, which is the last value printed.

So no index is really skipped. Position 7 is the point between \r and \n, and text mode never stops there because it returns the pair as one character.
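The effect of newline translation can be checked with a short sketch. It assumes the file holds the question's digits with \r\n line endings (a guess based on the printed positions) and compares the size on disk with what text mode returns. The function name compare_lengths is hypothetical; the newline parameter of open() is the standard one.

```python
import os
import tempfile

# Assumed contents of test2.txt with CRLF line endings.
SAMPLE = b"012345\r\n\r\n6789"


def compare_lengths(data=SAMPLE):
    """Return (size on disk, translated text, untranslated text)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Default newline=None: "\r\n" is translated to "\n" on read.
        with open(path, encoding="ascii") as f:
            translated = f.read()
        # newline="": line endings are passed through unchanged.
        with open(path, encoding="ascii", newline="") as f:
            untranslated = f.read()
        return os.path.getsize(path), translated, untranslated
    finally:
        os.remove(path)


size, translated, untranslated = compare_lengths()
print(size, len(translated), len(untranslated))  # prints: 14 12 14
```

The two-character gap between 14 bytes and 12 characters is exactly the two \r bytes that text mode folds into the following \n.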

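The large value 18446744073709551623 is not a bug. In text mode tell() returns an opaque number, and this is a case where that number encodes more than a position. It equals 2**64 + 7. After '5' is returned, the reader has already consumed the following \r and is waiting to see whether a \n comes next. In CPython, text-mode tell() packs the decoder state into the bits above the low 64 bits of the byte position, so the result is byte position 7 plus a "pending \r" flag in bit 64. This layout is an implementation detail, so treat the number only as something to hand back to seek().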
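The arithmetic behind the large number can be checked directly. This sketch assumes CPython's internal layout for text-mode tell() values (byte position in the low 64 bits, decoder state above them); that layout is an implementation detail rather than a documented API, and split_cookie is a hypothetical helper.

```python
# The value printed after reading '5' in the question's output.
observed = 18446744073709551623


def split_cookie(cookie):
    """Split a text-mode tell() value into (byte_position, upper_bits).

    Assumes CPython's layout: the low 64 bits hold the byte position and
    decoder state is stored above them. Not a documented API.
    """
    position = cookie & (2**64 - 1)  # keep only the low 64 bits
    upper_bits = cookie >> 64        # whatever is stored above them
    return position, upper_bits


position, upper_bits = split_cookie(observed)
print(position, upper_bits)  # prints: 7 1
```

For the other values in the output, such as 8 or 10, the upper bits are zero, which is why they read as ordinary byte offsets.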
