Split file into smaller files of different sizes

Let's say I have a mixed-format CSV file:

abc def
ghi
1 2
3
6 5
5 6
abc def
ghi
4 5
3
6 8
abc def
ghi
4 9
1
7 8

I want to split the big file every time "abc def" occurs, writing each section to a new, smaller file. Is there an elegant, Pythonic way to do this?

I have tried using for loops, but I end up needing to declare a different variable for each loop iteration…

Solution:

A single loop is enough to split the large file into smaller ones.
Note that the proposed code opens an output file only when it finds a line that matches abc def (ignoring surrounding whitespace), so any lines before the first such line are skipped. If the first line of your file is not the splitter line, you may want to open an initial file before the loop, for example output_file = open("output_file_0.csv", "w").

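LARGE_FILE = "large_file.txt"
SPLITTER_LINE = "abc def"

nth_output_file = 1
output_file = None  # no output file is open until the first splitter line

with open(LARGE_FILE) as large_file:
    for line in large_file:
        if line.strip() == SPLITTER_LINE:
            # Start of a new section: close the previous file, open the next one.
            if output_file:
                output_file.close()
            output_file = open(f"output_file_{nth_output_file}.csv", "w")
            nth_output_file += 1
        # Lines before the first splitter line are skipped.
        if output_file:
            output_file.write(line)

# Close the last output file, if any was opened.
if output_file:
    output_file.close()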

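For example, if large_file.txt contains the following, the script writes three files (output_file_1.csv, output_file_2.csv, and output_file_3.csv), each beginning with an abc def line: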

abc def
ghi
1 2
3
6 5
5 6
abc def
ghi
4 5
3
6 8
abc def
ghi
4 9
1
7 8
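
If the input might not start with the splitter line, the same loop can be wrapped in a function that also keeps the leading lines. The following is only a sketch under a few assumptions: the name split_file and its parameters (splitter, out_dir, prefix) are made up for illustration, and lines before the first splitter are written to a file numbered 0.

```python
from pathlib import Path


def split_file(source, splitter="abc def", out_dir=".", prefix="output_file_"):
    """Split `source` into numbered CSV files, one per splitter line.

    Hypothetical helper: lines that appear before the first splitter line
    are written to <prefix>0.csv instead of being dropped.
    Returns the list of paths that were created, in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    created = []
    count = 0           # number of splitter lines seen so far
    output_file = None  # currently open output file, if any

    try:
        with open(source) as large_file:
            for line in large_file:
                if line.strip() == splitter:
                    count += 1
                    start_new = True
                else:
                    # Only true for leading lines before the first splitter.
                    start_new = output_file is None

                if start_new:
                    if output_file is not None:
                        output_file.close()
                    path = out_dir / f"{prefix}{count}.csv"
                    output_file = open(path, "w")
                    created.append(path)

                output_file.write(line)
    finally:
        # Make sure the last file is closed even if reading fails midway.
        if output_file is not None:
            output_file.close()

    return created
```

With the sample data above, split_file("large_file.txt") should create output_file_1.csv, output_file_2.csv, and output_file_3.csv, the same as the original loop, and return their paths.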

Hope that helps
