Append each even line to the end of the preceding odd line, except for the last line

I have some .txt files in the E:\Desktop\prog\OCR directory, and each file has a format like the following:

Fytytyotyrtyttyran
57.338
CtyOtyBtyOtyL
13.318
AytLtGtyOtyL
10.254
Ayttssemtybtyly
5.33
BtyAtySItyC
2.061
AytryPL
1.53
Lirtysyrtyp
1.466
Ctry
0
Patretsyttrcal
0
1965 Q2

Now I want to convert the above list to the following format:

Fytytyotyrtyttyran;57.338
CtyOtyBtyOtyL;13.318
AytLtGtyOtyL;10.254
Ayttssemtybtyly;5.33
BtyAtySItyC;2.061
AytryPL;1.53
Lirtysyrtyp;1.466
Ctry;0
Patretsyttrcal;0
1965 Q2

Note that the last line of each file does not need any change.
I wrote the following Python script for this:

import os

input_directory = r'E:\Desktop\prog\OCR'
output_directory = r'E:\Desktop\prog\OCR\output'

def merge_even_odd_lines(input_path, output_path):
    with open(input_path, 'r', encoding='utf-8') as infile:
        lines = infile.readlines()

    merged_lines = []
    for i in range(0, len(lines), 2):
        if i + 1 < len(lines):
            odd_line = lines[i].strip()
            even_line = lines[i + 1].strip()
            merged_lines.append(f"{odd_line};{even_line}")
        else:
            merged_lines.append(lines[i].strip())

    with open(output_path, 'w', encoding='utf-8') as outfile:
        outfile.write('\n'.join(merged_lines))

def process_files(directory_path):
    if not os.path.exists(output_directory):
        os.makedirs(output_directory)

    for root, _, files in os.walk(directory_path):
        for file in files:
            if file.endswith('.txt'):
                input_file_path = os.path.join(root, file)
                output_file_path = os.path.join(output_directory, file)
                merge_even_odd_lines(input_file_path, output_file_path)

if __name__ == "__main__":
    process_files(input_directory)
    print("Conversion completed successfully.")

But my script converts my files to the following format:

Fytytyotyrtyttyran;57.338;CtyOtyBtyOtyL;13.318
AytLtGtyOtyL;10.254;Ayttssemtybtyly;5.33
BtyAtySItyC;2.061;AytryPL;1.53
Lirtysyrtyp;1.466;Ctry;0
Patretsyttrcal;0;1965 Q2

Where is the problem in my script?

Solution:

The problem is that you’re processing the output files as input files: the output directory is a subdirectory of the input directory, and os.walk() descends into subdirectories. So each file gets merged twice.
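
To see why a second pass produces the doubled output, here is a minimal sketch that applies the same pairing logic twice to a shortened sample. The helper name merge_pairs and the sample values are made up for illustration; only the pairing rule comes from the script in the question.

```python
def merge_pairs(lines):
    """Join each odd line with the even line after it; keep a trailing unpaired line."""
    stripped = [line.strip() for line in lines]
    merged = []
    for i in range(0, len(stripped), 2):
        if i + 1 < len(stripped):
            merged.append(f"{stripped[i]};{stripped[i + 1]}")
        else:
            merged.append(stripped[i])
    return merged


# Hypothetical shortened sample: three name/value pairs plus the final line.
sample = ["A", "1", "B", "2", "C", "3", "1965 Q2"]

once = merge_pairs(sample)   # the wanted result
twice = merge_pairs(once)    # what happens when the output file is read again

print(once)   # ['A;1', 'B;2', 'C;3', '1965 Q2']
print(twice)  # ['A;1;B;2', 'C;3;1965 Q2']
```

The second list has the same shape as the unwanted output in the question: already merged lines are paired up again, and the final line gets glued onto the last pair.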

If you don’t need to process the directory hierarchy recursively, don’t use os.walk(); just loop over the files in input_directory (this needs import glob):

for file in glob.glob(os.path.join(input_directory, "*.txt")):
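
A fuller sketch of that loop might look like the following. The function name process_flat and its convert parameter are assumptions for illustration; convert stands in for merge_even_odd_lines from the question, or any function taking an input path and an output path.

```python
import glob
import os


def process_flat(input_directory, output_directory, convert):
    """Call convert(input_path, output_path) for each .txt file directly
    inside input_directory. Subdirectories are not visited, so an output
    directory nested inside the input directory is never read back."""
    os.makedirs(output_directory, exist_ok=True)
    processed = []
    for input_path in sorted(glob.glob(os.path.join(input_directory, "*.txt"))):
        output_path = os.path.join(output_directory, os.path.basename(input_path))
        convert(input_path, output_path)
        processed.append(input_path)
    return processed
```

With the names from the question, the call would be process_flat(input_directory, output_directory, merge_even_odd_lines). Running it repeatedly should be safe, since the pattern only matches files at the top level.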

If you do need to recurse, the simplest solution is to move the output directory out of the input directory. Another choice is to check whether root is in the output directory and skip those files.
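
If recursion is needed and the output directory has to stay where it is, one way to skip it is to prune it from the walk. This is a sketch, not the only option; the generator name iter_input_files is hypothetical. It relies on the documented os.walk() behavior that, in the default top-down mode, editing the dirs list in place controls which subdirectories are visited.

```python
import os


def iter_input_files(input_directory, output_directory):
    """Yield .txt paths under input_directory, never entering output_directory."""
    skip = os.path.normcase(os.path.abspath(output_directory))
    for root, dirs, files in os.walk(input_directory):
        # Removing the output directory from dirs in place stops os.walk()
        # from descending into it.
        dirs[:] = [
            d for d in dirs
            if os.path.normcase(os.path.abspath(os.path.join(root, d))) != skip
        ]
        for name in files:
            if name.endswith(".txt"):
                yield os.path.join(root, name)
```

Note that the script in the question writes every result into one flat output directory, so with recursion two input files that share a name in different subdirectories would overwrite each other.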
