Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas – List of Duplicates but Only Maintain Rows with Same Value Across All Duplicates

Working on some data cleaning here.

I’m having a tough time parsing out duplicated rows where the value in column B is the same across all duplicated instances for column A. I want to keep the duplicates, and then I will drop the duplicates in a future step so I end up with one row per for each combo of col A/col B. For now, I just want the rows that I’m interested in.

Example dataframe:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

A     B
1     1
1     1
1     2
1     2
2     5
2     5
2     5

In the example above, I only want to maintain the rows with 2 and 5 because the value of 5 is the same value across all instances of 2. When you look at the instance of 1, it technically has duplicates, but it has 2 different values across all instances of 1 (1 and 2).

Example of what I’m trying to achieve:

A     B
2     5
2     5
2     5

Since I would also drop the duplicates after I get to that point, this is what I’d ideally end up with in case that changes the recommended solution:

A     B
2     5

Here is the code I’m using which at least brought me to this instance:
updates_dupes = updates_df[updates_df.duplicated(["A", "B"])]

Any thoughts?

>Solution :

new_df = df.groupby('A').filter(lambda group: group['B'].nunique() == 1).drop_duplicates()

Output:

>>> df
   A  B
0  1  1
1  1  1
2  1  2
3  1  2
4  2  5
5  2  5
6  2  5

>>> df.groupby('A').filter(lambda group: group['B'].nunique() == 1)
   A  B
4  2  5
5  2  5
6  2  5

>>> df.groupby('A').filter(lambda group: group['B'].nunique() == 1).drop_duplicates()
   A  B
4  2  5
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading