Remove duplicates of list based on condition

February 15, 2022

Say, I have the following two lists:

list1 = ['A', 'A', 'B', 'B', 'C', 'D']
list2 = ['x', 'y', 'y', 'x', 'x', 'y']

I want to remove the duplicates in list1, together with their corresponding elements in list2. For each value that occurs more than once in list1, only the entry whose corresponding element in list2 is 'y' should be kept.

Expected outcome:

list1 = ['A', 'B', 'C', 'D']
list2 = ['y', 'y', 'x', 'y']

The final goal is to continue working with the indices of the elements that are kept. For the example above, that would be:

index = [1, 2, 4, 5]

I tried solving this with pandas:

import pandas as pd

df = pd.DataFrame(zip(list1, list2), columns=["l1", "l2"])
df = df[(~(df.duplicated(['l1']))) | (df.duplicated(['l1']) & df.l2.eq('y'))]

But this does not give me the correct result. Note that I cannot simply drop the first or last occurrence, because 'x' and 'y' do not always appear in the same order.

A pandas solution would be fine but is not required; a solution using a list comprehension would work as well.

>Solution :

You could use:

# keep if: l1 is not duplicated     OR  l2 == "y"
df[~df['l1'].duplicated(keep=False) | df['l2'].eq('y')]

output:

  l1 l2
1  A  y
2  B  y
4  C  x
5  D  y
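Since the question also allows a list comprehension, here is a minimal sketch of the same rule without pandas. It assumes the same condition as the filter above: keep a position if its list1 value occurs only once, or if its list2 partner is 'y'. The names counts, new_list1 and new_list2 are only illustrative. Note that under this rule a duplicated value with no 'y' partner is dropped entirely, and one with several 'y' partners keeps all of them.

```python
from collections import Counter

list1 = ['A', 'A', 'B', 'B', 'C', 'D']
list2 = ['x', 'y', 'y', 'x', 'x', 'y']

# Count how often each value occurs in list1.
counts = Counter(list1)

# Keep position i if its list1 value is unique, or if its list2 partner is 'y'.
index = [
    i for i, (a, b) in enumerate(zip(list1, list2))
    if counts[a] == 1 or b == 'y'
]

# Rebuild both lists from the kept positions.
new_list1 = [list1[i] for i in index]
new_list2 = [list2[i] for i in index]

print(index)      # [1, 2, 4, 5]
print(new_list1)  # ['A', 'B', 'C', 'D']
print(new_list2)  # ['y', 'y', 'x', 'y']
```

With the pandas version, the same positions should be available from the filtered frame's index, for example via .index.tolist(), as long as the DataFrame still has its default integer index.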

by MR

Published February 15, 2022
