Pandas: how to create a new DataFrame that only has duplicate ids

I am trying to create a new dataframe that has the columns id and name, for all the duplicate ids in the dataframe.

My dataframe's structure is:

id, name, lat, lon, price, minimum_nights, review_cnt

I tried the .duplicated function, but I am not getting what I need. I think I might be using it wrong.


>Solution :

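By default, .duplicated() uses keep='first': it flags every repeated row except its first occurrence, so the first copy of each duplicate is left out of the result. Passing keep=False flags all occurrences, including the first. Note that once only 'id' and 'name' are selected, two rows count as duplicates only when both columns match; to match on id alone, pass subset='id'. To get all duplicated rows for 'id' and 'name', including the first occurrence: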

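# Keep only the two columns of interest (a new name avoids overwriting df)
id_name = df[['id', 'name']].copy()
# keep=False marks every occurrence of a duplicated row, not just the later ones
dupes = id_name[id_name.duplicated(keep=False)]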
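
As a rough illustration of how the keep and subset arguments change the result, here is a small sketch. The sample data is made up for this example and only borrows column names from the question; the variable names are likewise illustrative.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 3, 4],
    "name": ["a", "b", "b", "c", "d", "e"],
    "price": [10, 20, 20, 30, 35, 40],
})

# Default keep="first": the first occurrence of each id is NOT flagged.
later_only = df[df.duplicated(subset="id")]

# keep=False flags every row whose id appears more than once.
dupes_by_id = df.loc[df.duplicated(subset="id", keep=False), ["id", "name"]]

# Comparing id and name together only matches rows where both values repeat.
dupes_by_pair = df.loc[
    df.duplicated(subset=["id", "name"], keep=False), ["id", "name"]
]

print(dupes_by_id)
print(dupes_by_pair)
```

In this sample, id 3 appears twice with different names, so it shows up when matching on id alone but not when matching on the (id, name) pair. Which variant is appropriate depends on whether "duplicate" means a repeated id or a fully repeated id/name combination.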