Questions

Drop duplicates in pandas but keep duplicates between years

October 12, 2022

I have a dataset like this:

ID     Year    Day
 1     2001    150
 2     2001    140
 3     2001    120
 3     2002    160
 3     2002    160
 3     2017     75
 3     2017     75
 4     2017     80

I would like to drop the duplicates within each year, but keep those where the year differs. The end result would be this:

 1     2001    150
 2     2001    140
 3     2001    120
 3     2002    160
 3     2017     75
 4     2017     80

I tried to do something like this in Python with my pandas DataFrame:

import pandas as pd

data = pd.read_csv('data.csv')
data = data.drop_duplicates(subset=['ID'], keep='first')

But this also deletes rows that share an ID across different years, which I would like to keep.

How do I keep the duplicates between years?

Solution:

Add 'Year' to your subset, so that rows count as duplicates only when both ID and Year match.

data.drop_duplicates(subset = ['ID','Year'], keep = 'first')

    ID  Year    Day
0   1   2001    150
1   2   2001    140
2   3   2001    120
3   3   2002    160
5   3   2017    75
7   4   2017    80
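
For anyone who wants to try this without a CSV file, here is a minimal sketch that rebuilds the sample data from the question in memory. The variable name `deduped` and the final `reset_index` call are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "ID":   [1, 2, 3, 3, 3, 3, 3, 4],
    "Year": [2001, 2001, 2001, 2002, 2002, 2017, 2017, 2017],
    "Day":  [150, 140, 120, 160, 160, 75, 75, 80],
})

# Rows count as duplicates only when both ID and Year match,
# so ID 3 survives once per year instead of once overall.
deduped = data.drop_duplicates(subset=["ID", "Year"], keep="first")

# Optional (an addition here): renumber the index 0..n-1 after dropping rows.
deduped = deduped.reset_index(drop=True)

print(deduped)
```

Note that `drop_duplicates` returns a new DataFrame, so the result has to be assigned to a name to be kept. Also, with this subset, two rows that share ID and Year but differ in Day would still be collapsed to the first one; leaving out `subset` compares all columns instead.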

dataframe

by MR

Published October 12, 2022
