Generator always returning same value

I have a function that reads a file line by line and returns each line as a list of words. Since the file is very large, I would like to make it a generator.

Here is the function:

def tokenize_each_line(file):
    with open(file, 'r') as f:
        for line in f:
            yield line.split()

However, every time I call next(tokenize_each_line()), it returns the first line of the file. I guess this is not the expected behavior for generators. Instead, I'd like each call to return the next line.

Solution:

Calling the function tokenize_each_line() returns a newly-initialized generator. So next(tokenize_each_line()) initializes a generator and makes it yield its first item (the first line of the file).
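
To make the difference concrete, here is a minimal sketch that compares calling next on a fresh generator each time with calling it on one stored generator. The temporary file and its three lines are made up for the demonstration and are not part of the original question.

```python
import os
import tempfile


def tokenize_each_line(file):
    with open(file, 'r') as f:
        for line in f:
            yield line.split()


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("first line\nsecond line\nthird line\n")
    sample_path = tmp.name

# Each call builds a brand-new generator, so next() starts from line one again.
fresh_a = next(tokenize_each_line(sample_path))  # ['first', 'line']
fresh_b = next(tokenize_each_line(sample_path))  # ['first', 'line']

# A single generator object remembers its position between next() calls.
gen = tokenize_each_line(sample_path)
kept_a = next(gen)  # ['first', 'line']
kept_b = next(gen)  # ['second', 'line']
gen.close()  # close the generator so the file handle is released

os.remove(sample_path)
```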

Instead, create the generator once, keep a reference to it, and call next on that same object each time you need another line.

For example:

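gen = tokenize_each_line('myfile.txt')

# Just an example of how you might use the generator:
# collect words until there are at least 1000 or the file runs out.
words = []
while len(words) < 1000:
    line_words = next(gen, None)  # None signals that the generator is exhausted
    if line_words is None:
        break
    words += line_words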
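
If the goal is simply to walk through the file, it is often easier to let a for loop or itertools.islice drive the generator rather than calling next by hand. The following is a sketch under that assumption. The sample file contents are invented for illustration.

```python
import itertools
import os
import tempfile


def tokenize_each_line(file):
    with open(file, 'r') as f:
        for line in f:
            yield line.split()


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("a b\nc d e\nf\n")
    sample_path = tmp.name

# islice pulls only the first two lines without reading the rest of the file.
first_two = list(itertools.islice(tokenize_each_line(sample_path), 2))

# A for loop calls next() behind the scenes and stops cleanly at end of file.
all_words = []
for line_words in tokenize_each_line(sample_path):
    all_words.extend(line_words)

os.remove(sample_path)
```

Both forms still read the file lazily, one line at a time, so they suit large files as well as the manual next calls do.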