Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas drop duplicates based on one group and keep the last value

I have a dataframe:

import pandas as pd

data = pd.DataFrame({"col1": ["a", "a", "a", "a", "a", "a"],
                     "col2": [0,0,0,1,1, 1],
                     "col3": [1,2,3,4,5, 6]})

data


  col1  col2    col3
0   a   0   1
1   a   0   2
2   a   0   3
3   a   1   4
4   a   1   5
5   a   1   6

I’m trying to remove the duplicates based on col2 == 1 and keep the last entry

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Using the below code I was able to keep the first and drop others.

data[~(data.duplicated(["col2"]) & data.col2.eq(1))]
col1    col2    col3
0   a   0   1
1   a   0   2
2   a   0   3
3   a   1   4

How to remove duplicates based on one category in a column and keep the last entry?

Desired Output

  col1  col2    col3
0   a   0   1
1   a   0   2
2   a   0   3
3   a   1   6

>Solution :

Use boolean indexing with help of groupby.cumcount (or duplicated):

# is the row not a 1 in col2?
m1 = data['col2'].ne(1)
# is the row the last of the group?
m2 = data.groupby('col2').cumcount(ascending=False).eq(0)
# or
# m2 = ~data['col2'].duplicated(keep='last')

# keep rows matching either condition
out = data[m1|m2]

Reversed logic:

# is the group a 1?
m1 = data['col2'].eq(1)
# is the row NOT the last one?
m2 = data['col2'].duplicated(keep='last')

# drop the rows matching both conditions
out = data[~(m1&m2)]

Output:

  col1  col2  col3
0    a     0     1
1    a     0     2
2    a     0     3
5    a     1     6
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading