I have a list of values, and I need to search an ~1 GB text file for those values and then pull a corresponding value from the same row. Is there a more efficient approach than nested for loops, where I iterate over my list of values and, for each one, iterate through the entire text file looking for a match?
for name in name_list:
    with open("test.txt") as infile:
        for line in infile:
            currentline = line.split("|")
            if name == currentline[0]:
                print(currentline[2])
>Solution :
This is exactly what a dictionary is for. Assuming parts[0] is unique:
dic = {}
with open("test.txt") as infile:
    for line in infile:
        parts = line.split("|")
        dic[parts[0]] = parts[2]

for name in name_list:
    if name in dic:
        print(dic[name])
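For illustration, here is that lookup run end-to-end on a few in-memory sample rows – io.StringIO stands in for the real test.txt, and the names and fields are made up. The rstrip("\n") guards against a trailing newline sneaking into the last column:

```python
import io

# Hypothetical sample data in place of the 1 GB file.
sample = "alice|30|engineer\nbob|25|teacher\ncarol|41|doctor\n"
name_list = ["bob", "dave", "alice"]  # "dave" has no row and is simply skipped

# One pass over the "file" to build the lookup table.
dic = {}
for line in io.StringIO(sample):
    parts = line.rstrip("\n").split("|")
    dic[parts[0]] = parts[2]

# Constant-time lookups instead of rescanning the file per name.
for name in name_list:
    if name in dic:
        print(dic[name])  # prints "teacher", then "engineer"
```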
If it isn’t unique, you’ll have to keep a list of matches for each dictionary entry:
dic = {}
with open("test.txt") as infile:
    for line in infile:
        parts = line.split("|")
        if parts[0] in dic:
            dic[parts[0]].append(parts[2])
        else:
            dic[parts[0]] = [parts[2]]

for name in name_list:
    if name in dic:
        for matching in dic[name]:
            print(matching)
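As a side note, the if/else grouping above is often written with collections.defaultdict, which creates the empty list automatically on first access. A sketch of the same build step, again using in-memory sample data instead of the real file:

```python
from collections import defaultdict
import io

# Hypothetical sample rows with a duplicated key.
sample = "alice|30|engineer\nalice|31|manager\nbob|25|teacher\n"

dic = defaultdict(list)  # missing keys start out as an empty list
for line in io.StringIO(sample):
    parts = line.rstrip("\n").split("|")
    dic[parts[0]].append(parts[2])  # no membership check needed
```

The behavior is equivalent to the explicit if/else; it just removes the boilerplate around creating the first list for a key.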
This drastically reduces the runtime: assuming your file has n lines and your name_list has m entries, the nested loops were O(n * m). Now you pay O(n) once to build the dictionary, and each hash-map lookup takes expected constant time, so the total is O(n + m) – you no longer perform a linear search per name.
In practice the speedup will be even larger, since your previous version also reopened and reread the entire file for every name in the outer loop.
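One more note: if the file is genuinely "|"-delimited, the standard-library csv module can do the splitting and also handles quoted fields that contain the delimiter – something a plain split("|") gets wrong. A sketch under the same column assumptions (key in column 0, value in column 2), with a made-up in-memory sample:

```python
import csv
import io

# Hypothetical sample; the second row has a quoted field containing "|".
sample = 'alice|30|engineer\n"smith|jones"|50|lawyer\n'

reader = csv.reader(io.StringIO(sample), delimiter="|")
dic = {row[0]: row[2] for row in reader}

print(dic["smith|jones"])  # the quoted name survives as a single field
```

If your data never contains quoted or embedded delimiters, plain split("|") is fine and slightly faster.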