Generator comprehension with open function

I’m trying to figure out the best way to use a generator when parsing a file line by line.
Which of these two uses of a generator expression is better?

First option.

with open('some_file') as file:
    lines = (line for line in file)

Second option.

lines = (line for line in open('some_file'))

I know both will produce the same results, but which one is faster or more efficient?

>Solution :

You can’t safely combine a generator expression with a context manager (a with statement) in either of these ways.

Generators are lazy. They will not actually read their source data until something requests an item from them.
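As a rough illustration of that laziness, the sketch below uses a hypothetical helper, tracked(), which is not part of the original question. It records each item at the moment the generator actually processes it.

```python
evaluated = []  # records which items have actually been processed


def tracked(item):
    """Hypothetical helper: note that the item was touched, then transform it."""
    evaluated.append(item)
    return item.upper()


source = ["a", "b", "c"]
gen = (tracked(item) for item in source)  # builds the generator, processes nothing

before = list(evaluated)     # snapshot taken before any item is requested
first = next(gen)            # requesting one item processes exactly one element
after_one = list(evaluated)  # snapshot taken after a single request
```

Only the per-item work is deferred. The outermost iterable of a generator expression is evaluated as soon as the expression is created, which is why open('some_file') in the second option opens the file immediately.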

This appears to work:

with open('some_file') as file:
    lines = (line for line in file)

but when you actually try to read a line later in your program

for line in lines:
    print(line)

it will fail with ValueError: I/O operation on closed file.

This is because the context manager has already closed the file (that’s its sole purpose in life), and the generator did not start reading from it until the for loop actually requested data.
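A minimal way to reproduce this, assuming a throwaway temporary file in place of some_file (the names demo_path and error_message are only for this sketch):

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    demo_path = tmp.name

with open(demo_path) as file:
    lines = (line for line in file)  # nothing is read from the file here
# The with block has exited, so the file is already closed.

try:
    next(lines)  # the first real read happens only now
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
os.remove(demo_path)
```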

Your second suggestion

lines = (line for line in open('some_file'))

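suffers from the opposite problem. You open() the file, but nothing ever closes it explicitly: you never keep a reference to the file object, so you can’t call close() on it yourself. The file stays open until the interpreter happens to garbage-collect the file object, and when that happens is not something you should rely on. That’s the very situation that context managers fix.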
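The following sketch shows that exhausting such a generator does not close the file. It deliberately keeps a reference in a variable named handle, which the original one-liner does not do, purely so the state of the file can be inspected afterwards.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    demo_path = tmp.name

handle = open(demo_path)            # reference kept only for inspection
lines = (line for line in handle)

collected = list(lines)             # read every line, exhausting the generator
still_open = not handle.closed      # reaching end-of-file does not close the file

handle.close()                      # manual cleanup that the one-liner cannot do
os.remove(demo_path)
```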

Overall, if you want to read the file, you can either … read the file:

with open('some_file') as file:
    lines = list(file)

or you can use a real generator:

def lazy_reader(*args, **kwargs):
    with open(*args, **kwargs) as file:
        yield from file

and then you can do

for line in lazy_reader('some_file', encoding="utf8"):
    print(line)

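and lazy_reader() will close the file once the last line has been read. If you stop iterating early, the file is closed when the generator itself is closed or garbage-collected.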
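To check the closing behaviour, here is a sketch that adds a hypothetical opened list to lazy_reader(). That list is not part of the answer's version and exists only so the file object can be inspected. It also uses contextlib.closing to close the generator deterministically when iteration stops early.

```python
import os
import tempfile
from contextlib import closing

opened = []  # hypothetical hook: lets the demo inspect each file afterwards


def lazy_reader(*args, **kwargs):
    with open(*args, **kwargs) as file:
        opened.append(file)
        yield from file


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    demo_path = tmp.name

# Case 1: read everything; the with block inside the generator exits normally.
all_lines = list(lazy_reader(demo_path, encoding="utf8"))
closed_after_exhaustion = opened[0].closed

# Case 2: stop after one line; closing() calls the generator's close() method,
# which unwinds the with block inside lazy_reader().
with closing(lazy_reader(demo_path, encoding="utf8")) as reader:
    first_line = next(reader)
closed_after_early_exit = opened[1].closed

os.remove(demo_path)
```

Wrapping the generator in closing() is only needed when the loop may end before the last line and you want the file released at a predictable point.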
