I have two text files, f1 and f2, with 100k lines of names each. I want to compare the first line of f1 with every line of f2, then the second line of f1 with every line of f2, and so on. I tried a nested for loop like the code below, but it doesn’t work.
I can’t seem to find what I am doing wrong. Can someone please tell me?
Thanks in advance.
old.txt
sourcreameggnest
saturnnixgreentea
saxophonedesertham
footballplumvirgo
soybeansthesting
cauliflowertornado
sourcreameggnest
saturnnixgreentea
new.txt
goldfishpebbleduck
saxophonedesertham
footballplumvirgo
abloomtheavengers
venisonflowersea
goodfellaswalker
saturnnixgreentea
Code:
with open('old.txt', 'r') as f1, open('new.txt', 'r') as f2:
    for line1 in f1:
        print('Line 1:- ' + line1, end='')
        for line2 in f2:
            print('Line 2:- ' + line2, end='')
            if line1.strip() == line2:
                print("Inside comparison" + line1, end='')
Output:
Line 1:- goldfishpebbleduck
Line 2:- sourcreameggnest
Line 2:- saturnnixgreentea
Line 2:- saxophonedesertham
Line 2:- footballplumvirgo
Line 2:- soybeansthesting
Line 2:- cauliflowertornado
Line 2:- sourcreameggnest
Line 2:- saturnnixgreentea
Line 1:- saxophonedesertham
Line 1:- footballplumvirgo
Line 1:- abloomtheavengers
Line 1:- venisonflowersea
Line 1:- goodfellaswalker
Line 1:- saturnnixgreentea
>Solution :
Combining the answers of @LukasNeugebauer and @Thierry Lathuille, here’s what your code should look like:
with open('old.txt', 'r') as f1, open('new.txt', 'r') as f2:
    lines1 = f1.readlines()
    lines2 = f2.readlines()
    for line1 in lines1:
        print('Line 1:- ' + line1, end='')
        if line1 in lines2:
            print("Inside comparison" + line1, end='')
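As for why the nested loop in the question stops after the first outer iteration: a file object is an iterator, so the first pass of the inner loop reads f2 to the end and every later pass finds nothing left. On top of that, comparing line1.strip() with an unstripped line2 fails whenever line2 still carries its trailing newline. Here is a minimal sketch of both effects, using io.StringIO as a stand-in for an open file (the sample contents are made up for illustration):

```python
import io

# io.StringIO stands in for an open text file; it iterates line by line the same way.
f2 = io.StringIO("a\nb\nc\n")

first_pass = [line.strip() for line in f2]   # consumes the whole stream
second_pass = [line.strip() for line in f2]  # nothing left to read

print(first_pass)   # ['a', 'b', 'c']
print(second_pass)  # []

# Rewinding makes the lines available again.
f2.seek(0)
third_pass = [line.strip() for line in f2]
print(third_pass)   # ['a', 'b', 'c']

# A stripped line never equals a line that still ends in a newline.
print("a" == "a\n")  # False
```

Reading the lines into a list once, as in the code above, avoids having to call seek(0) before every inner loop.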
If you are wondering whether using the in check is faster than iterating through the second list and comparing each value with ==, I tested it. With both files containing 10,000 lines of random strings, it took ~2.8 seconds to process them fully with two loops and only ~0.8 seconds using the in operator.
If your files are not bigger than a megabyte, I wouldn’t bother optimizing this; otherwise you should think about what you are actually comparing and what shortcuts you can use.
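One such shortcut, if all you need is exact matches, is to put the lines of the second file into a set, so each membership test no longer scans the whole list. The sketch below is only an illustration: common_lines is a hypothetical helper name, and the sample lists stand in for what readlines() would return. It strips both sides, which is an assumption about what counts as "equal" (it ignores leading and trailing whitespace, including a missing newline on the last line).

```python
def common_lines(old_lines, new_lines):
    """Return the lines of old_lines that also occur in new_lines, in order."""
    # Strip both sides so a final line without a trailing newline still matches.
    seen = {line.strip() for line in new_lines}
    return [line.strip() for line in old_lines if line.strip() in seen]


# Sample data shaped like the result of f.readlines(); the last line has no newline.
old = ["sourcreameggnest\n", "saturnnixgreentea\n",
       "saxophonedesertham\n", "footballplumvirgo"]
new = ["goldfishpebbleduck\n", "saxophonedesertham\n",
       "footballplumvirgo\n", "saturnnixgreentea"]

matches = common_lines(old, new)
print(matches)  # ['saturnnixgreentea', 'saxophonedesertham', 'footballplumvirgo']
```

With real files you would pass the two open file objects (or their readlines() results) instead of the sample lists. Building the set costs one pass over the second file, after which each lookup takes roughly constant time on average.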