Python reading text file every second line skipped

I am processing a shell script in Python. My first step is to comb through the file and save only the important lines in a list (of strings). However, I have isolated a problem where every second line is ignored. Why is the second, fourth, etc. line skipped in the following code?

f = open("sample_failing_file.txt", encoding="ISO-8859-1")
readfile = f.read()
filelines = readfile.split("\n")

def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    for line in filecontent:
        if drop_line_appropriate(line):
            filecontent.remove(line)
    return filecontent

def drop_line_appropriate(line: str) -> bool:
    if line.startswith("#"):
        return True
    # some more conditions, omitted here
    return False

filelines = remove_irrelevant_lines(filelines)
f.close()

When I run this code, I can see that filecontent is complete. However, when I inspect line inside the loop, some lines (e.g. "# some line 3") are never visited. Here is a simplified version of the shell script on which my Python script fails (sample_failing_file.txt):

#!/bin/sh
#
# some line 1
#
# some line 2
# some line 3


Solution:

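As was pointed out in the comments, you shouldn't remove elements from a list while iterating over it. The for loop tracks its position by index, so when the current element is removed, every later element shifts one slot to the left and the next iteration steps over the element that moved into the vacated position. Additionally, you don't want to use list.remove() here: it searches the list from the start for the first equal element, which makes the whole loop quadratic in the number of lines and can remove a different (earlier) duplicate than the one you are looking at.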
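To make the skipping visible, here is a minimal sketch that reproduces the pattern from the question on the sample lines and records which lines the loop actually visits. The helper name remove_while_iterating and the visited list are illustrative assumptions, not part of the original code, and the condition is reduced to the startswith("#") check.

```python
def remove_while_iterating(lines):
    """Reproduce the buggy pattern on a copy and record the visited lines."""
    lines = list(lines)  # work on a copy so the caller's list is untouched
    visited = []
    for line in lines:
        visited.append(line)
        if line.startswith("#"):
            # Removing shifts all later elements left by one,
            # so the loop's next index skips the following line.
            lines.remove(line)
    return visited, lines


sample = [
    "#!/bin/sh",
    "#",
    "# some line 1",
    "#",
    "# some line 2",
    "# some line 3",
]

visited, remaining = remove_while_iterating(sample)
print(visited)    # lines the loop body actually saw
print(remaining)  # lines left over even though all start with "#"
```

Only every other line is visited, and the loop ends early because the list shrinks underneath it, so "# some line 3" is never inspected.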

The following should fix your problem and also run vastly faster:

def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    return [line for line in filecontent if not drop_line_appropriate(line)]

This creates and returns a new list, filtering out the lines indicated by drop_line_appropriate.
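As a usage sketch, the filter can be tried on a few lines. The sample below adds two hypothetical non-comment lines ("echo hello" and "exit 0") that are not in the original file, and drop_line_appropriate is reduced to the single condition shown in the question. If other code holds a reference to the original list and must see the change, slice assignment is one way to update that list in place; this is an optional variation, not something the question requires.

```python
def drop_line_appropriate(line: str) -> bool:
    # Only the condition shown in the question; the omitted ones are left out.
    return line.startswith("#")


def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    return [line for line in filecontent if not drop_line_appropriate(line)]


# Hypothetical sample: the two non-comment lines are made up for illustration.
sample = ["#!/bin/sh", "#", "# some line 1", "echo hello", "# some line 2", "exit 0"]

kept = remove_irrelevant_lines(sample)
print(kept)         # the filtered copy
print(len(sample))  # the input list itself is not modified

# Optional: replace the contents of an existing list object in place.
in_place = list(sample)
in_place[:] = [line for line in in_place if not drop_line_appropriate(line)]
print(in_place)
```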
