Pandas Dataframe – Finding Duplicates of One Column But Different in Another Column

May 17, 2023

I have a pandas DataFrame like this, for example:

idx	A	B
0	a1	b1
1	a2	b1
2	a2	b2
3	a2	b1
4	a3	b3
5	a3	b3
6	a4	b1

I want to find the rows whose value in column A is duplicated but whose values in column B differ, and select all of their indexes.

In the above example, the result should be:

idx	A	B
1	a2	b1
2	a2	b2
3	a2	b1

Rows 0 and 6 are dropped because their values in column A are unique.
Rows 4 and 5 are dropped because their values in column B are the same.
Rows 1 and 3 should both be kept: although they are identical to each other, row 2 has the same A value with a different B value, so the group is not all the same.

How can I achieve this goal?

>Solution :

You can use two groupby.transform calls to build a boolean mask for indexing:

g = df.groupby('A')['B']

# does the A group have more than one row, and more than one distinct B value?
out = df[g.transform('count').gt(1) & g.transform('nunique').gt(1)]

# having more than one distinct B value already implies that A is duplicated,
# so this simplifies to:
out = df[df.groupby('A')['B'].transform('nunique').gt(1)]
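
To see why the simplified mask is enough, it can help to look at the intermediate values. The following is a minimal sketch that rebuilds the sample data from the question; it assumes idx is simply the default integer index, and the names n_distinct and mask are introduced here for illustration only.

```python
import pandas as pd

# Sample data from the question; "idx" is assumed to be the default index.
df = pd.DataFrame(
    {
        "A": ["a1", "a2", "a2", "a2", "a3", "a3", "a4"],
        "B": ["b1", "b1", "b2", "b1", "b3", "b3", "b1"],
    }
)

# For every row, the number of distinct B values within its A group.
n_distinct = df.groupby("A")["B"].transform("nunique")

# A group with 2+ distinct B values necessarily has 2+ rows,
# so this single condition also covers "A is duplicated".
mask = n_distinct.gt(1)
out = df[mask]

print(n_distinct.tolist())  # [1, 2, 2, 2, 1, 1, 1]
print(out.index.tolist())   # [1, 2, 3]
```

Note that nunique ignores missing values by default, so a group whose B values are one label plus NaN would count as having a single distinct value.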

Or, with isin:

s = df.groupby('A')['B'].nunique()

out = df[df['A'].isin(s[s>1].index)]

Output:

   idx   A   B
1    1  a2  b1
2    2  a2  b2
3    3  a2  b1
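
For reference, here is a self-contained sketch of the isin variant on the same sample data, again assuming idx is the default integer index. It also shows groupby.filter as an alternative that is not part of the answer above; it should select the same rows, but it calls a Python function once per group and may be slower on large frames. The names keys, out_isin and out_filter are illustrative.

```python
import pandas as pd

# Sample data from the question; "idx" is assumed to be the default index.
df = pd.DataFrame(
    {
        "A": ["a1", "a2", "a2", "a2", "a3", "a3", "a4"],
        "B": ["b1", "b1", "b2", "b1", "b3", "b3", "b1"],
    }
)

# One row per A value: how many distinct B values it has.
s = df.groupby("A")["B"].nunique()

# Keep the A values with more than one distinct B, then select their rows.
keys = s[s > 1].index
out_isin = df[df["A"].isin(keys)]

# Alternative (not from the answer above): filter whole groups with a predicate.
out_filter = df.groupby("A").filter(lambda grp: grp["B"].nunique() > 1)

print(list(keys))                 # ['a2']
print(out_isin.index.tolist())    # [1, 2, 3]
print(out_filter.index.tolist())  # [1, 2, 3]
```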

by MR
