I have two lists:
- old_data has a length of 180,000.
- updated_data has a length of 184,000.
- Both lists contain IDs which are string values.
- There is a lot of overlap in IDs stored within the lists but I need to know which ones are no longer stored in updated_data so that I can say those IDs are not longer active in an SQL server.
Therefore, I need to check for any items in old_data that are not in updated_data and save them to a separate list, let’s call it inactive_data.
I have the following code currently which is highly time-consuming and inefficient:
# Initialize list for no-longer active IDs to be saved into. inactive_data =  # Iteratively check if all IDs in old data are still in updated data. # If they are not, add them to the list. for Id in old_data: if Id not in updated_data: inactive_data.append(Id)
To run this code, it took approx. 45 minutes. I would to know how I could drastically reduce the time. If you have any suggestions please let me know!
The usual and easy way is to turn one or both of the
list into a
An alternate way is to sort both lists and compare them element by element. Once they’re ordered, the comparison is quick because it’s O(n).