I need to compare two CSV files containing coordinates, line by line: if any point from the first file is close (less than 10 meters) to any point in the second file, the function should return True. Unfortunately, in the code below, the second (inner) loop is executed only once, and I don't know why. Also, is there perhaps a faster way?
Below is my code:
def comparison_closeness_two_files(filename1, filename2):
    threshold_meters = 10/1000 # 10 meters
    file1 = open(filename1, 'r')
    file1 = csv.reader(file1, delimiter=',')
    file2 = open(filename2, 'r')
    file2 = csv.reader(file2, delimiter=',')
    for index, fc1 in enumerate(file1):
        if (len(fc1) != 4):
            continue
        lat1 = float(fc1[2])
        lon1 = float(fc1[3])
        for index, fc2 in enumerate(file2):
            if (len(fc2) != 4):
                continue
            lat2 = float(fc2[2])
            lon2 = float(fc2[3])
            distance = get_distance_between(lat1, lon1, lat2, lon2)
            if (distance <= threshold_meters):
                return True
    return False
>Solution :
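Your problem comes from the fact that you are working with file streams, not arrays. A csv.reader behaves like a generator: once you have iterated over it, nothing is left, because the stream has been consumed. The inner loop therefore only produces rows during the first pass of the outer loop; on every later pass file2 is already exhausted.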
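To see the exhaustion in isolation, here is a minimal sketch that uses two small in-memory streams in place of the real files. The sample data and variable names are made up for illustration only.

```python
import csv
import io

# Two in-memory "files" standing in for the real CSVs (hypothetical data).
outer_rows = csv.reader(io.StringIO("a\nb\n"))
inner_rows = csv.reader(io.StringIO("1\n2\n"))

pairs = []
for outer in outer_rows:
    # inner_rows is a one-shot iterator: it is exhausted after the
    # first pass of the outer loop and yields nothing afterwards.
    for inner in inner_rows:
        pairs.append((outer[0], inner[0]))

# Only the first outer row was paired; row "b" found inner_rows empty.
print(pairs)  # [('a', '1'), ('a', '2')]
```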
To get around this, preload your data into lists (at least for the second file, which you iterate over many times).
Here is a solution that preloads both files into lists:
import csv

def comparison_closeness_two_files(filename1, filename2):
    # 10 meters, expressed in km (assumes get_distance_between returns kilometers)
    threshold_meters = 10/1000
    # Read both files fully into lists so they can be iterated more than once
    with open(filename1, 'r') as f1, open(filename2, 'r') as f2:
        file1 = list(csv.reader(f1, delimiter=','))
        file2 = list(csv.reader(f2, delimiter=','))
    for fc1 in file1:
        if len(fc1) != 4:
            continue  # skip rows that do not have exactly 4 fields
        lat1 = float(fc1[2])
        lon1 = float(fc1[3])
        for fc2 in file2:
            if len(fc2) != 4:
                continue
            lat2 = float(fc2[2])
            lon2 = float(fc2[3])
            distance = get_distance_between(lat1, lon1, lat2, lon2)
            if distance <= threshold_meters:
                return True
    return False
As mentioned above, if memory usage is a concern you may want to preload only the second file into a list and keep reading the first file as a stream.
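A rough sketch of that variant follows. It keeps the first file as a stream and stores only the second file's points, already converted to floats, so they are parsed once instead of once per outer row. Since get_distance_between is not shown in the question, haversine_km below is a hypothetical stand-in that assumes distances in kilometers; load_points and any_points_close are likewise illustrative helper names, not part of the original code.

```python
import csv
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed stand-in for get_distance_between)."""
    r = 6371.0  # mean Earth radius in kilometers
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def load_points(rows):
    """Parse (lat, lon) from columns 2 and 3, skipping rows without 4 fields."""
    return [(float(r[2]), float(r[3])) for r in rows if len(r) == 4]

def any_points_close(rows1, rows2, threshold_km=10 / 1000):
    """Return True if any point in rows1 is within threshold_km of a point in rows2."""
    points2 = load_points(rows2)  # second file: parsed once, reused for every row
    for row in rows1:             # first file: consumed as a stream
        if len(row) != 4:
            continue
        lat1, lon1 = float(row[2]), float(row[3])
        if any(haversine_km(lat1, lon1, lat2, lon2) <= threshold_km
               for lat2, lon2 in points2):
            return True
    return False

def comparison_closeness_two_files(filename1, filename2):
    with open(filename1, newline='') as f1, open(filename2, newline='') as f2:
        return any_points_close(csv.reader(f1), csv.reader(f2))
```

This is still a pairwise comparison, so the running time grows with the product of the two file sizes; the saving is only in memory for the first file and in not re-parsing the second.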