Remove duplicates of list based on condition

Say, I have the following two lists:

list1 = ['A', 'A', 'B', 'B', 'C', 'D']
list2 = ['x', 'y', 'y', 'x', 'x', 'y']

I want to remove the duplicates in list1, together with their corresponding elements in list2. For each value that occurs more than once in list1, only the entry whose corresponding element in list2 is 'y' should be kept. Values that occur only once are kept regardless.

Expected outcome:

list1 = ['A', 'B', 'C', 'D']
list2 = ['y', 'y', 'x', 'y']

The final goal is to continue working with the indices of the retained elements, which for the example above would be:

index = [1, 2, 4, 5]

I tried solving this with pandas:

import pandas as pd
df = pd.DataFrame(zip(list1, list2), columns=["l1", "l2"])
# attempt: keep first occurrences, plus later duplicates whose l2 is 'y'
df = df[(~(df.duplicated(['l1']))) | (df.duplicated(['l1']) & df.l2.eq('y'))]

But this does not give the correct result. Note that I cannot rely on dropping the first or the last occurrence, because 'x' and 'y' do not always appear in the same order.

A pandas solution would be fine but is not required; a solution using a list comprehension would work as well.

Solution:

You could use:

# keep if: l1 is not duplicated     OR  l2 == "y"
df[~df['l1'].duplicated(keep=False) | df['l2'].eq('y')]

output:

  l1 l2
1  A  y
2  B  y
4  C  x
5  D  y
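
Since the question also allows a list comprehension, the same rule (keep a position if its list1 value is not duplicated, or if its list2 partner is 'y') can be sketched without pandas. This is only a minimal illustration under the assumptions of the example; the names counts, filtered1 and filtered2 are made up here and do not come from the original post.

```python
from collections import Counter

list1 = ['A', 'A', 'B', 'B', 'C', 'D']
list2 = ['x', 'y', 'y', 'x', 'x', 'y']

# Count how often each value occurs so duplicated values can be detected.
# (counts, filtered1, filtered2 are illustrative names, not from the post.)
counts = Counter(list1)

# Keep position i if its list1 value is unique OR its list2 partner is 'y'.
index = [
    i
    for i, (a, b) in enumerate(zip(list1, list2))
    if counts[a] == 1 or b == 'y'
]

# Rebuild both lists from the retained positions.
filtered1 = [list1[i] for i in index]
filtered2 = [list2[i] for i in index]

print(index)      # [1, 2, 4, 5]
print(filtered1)  # ['A', 'B', 'C', 'D']
print(filtered2)  # ['y', 'y', 'x', 'y']
```

Both this sketch and the pandas mask share the same edge cases: a duplicated value with no 'y' partner is dropped entirely, and a duplicated value with several 'y' partners keeps all of them. With the pandas version, the retained indices should be available by calling .index.tolist() on the filtered frame.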