Pandas return column data as list without duplicates

This is a simplified example, but I have a large categorical dataset like this:

Name   Age Gender 
John    12 Male 
Ana     24 Female
Dave    16 Female
Cynthia 17 Non-Binary
Wayne   26 Male
Hebrew  29 Non-Binary

Suppose it is assigned to df, and I want the Gender column returned as a list without duplicate values:

'Male','Female','Non-Binary'

I tried the following code, but it returns the genders with duplicates:

list(df['Gender'])

How can I code it in pandas so that it can return values without duplicates?

Solution:

Remember that df["Gender"] is a pandas Series, so you can call .drop_duplicates() to get another Series with the duplicate values removed, or .unique() to get a NumPy array containing the unique values. Since you want a list, call .tolist() on either result.

>> df["Gender"].drop_duplicates()
0         Male 
1        Female
3    Non-Binary
4          Male
Name: Gender, dtype: object

>> df["Gender"].unique()
['Male ' 'Female' 'Non-Binary' 'Male']
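
Note that Male still appears twice in the output above: the first row's value carries a trailing space ('Male '), so pandas treats it as a different string from 'Male'. Below is a minimal sketch of one way to handle that, assuming stray whitespace is the only inconsistency in the column. The DataFrame is rebuilt by hand from the sample in the question, and the variable names raw and genders are just illustrative.

```python
import pandas as pd

# Sample data from the question; the trailing space in the first
# "Male " is kept on purpose, since the output above shows it.
df = pd.DataFrame({
    "Name": ["John", "Ana", "Dave", "Cynthia", "Wayne", "Hebrew"],
    "Age": [12, 24, 16, 17, 26, 29],
    "Gender": ["Male ", "Female", "Female", "Non-Binary", "Male", "Non-Binary"],
})

# Without cleaning, "Male " and "Male" count as different values.
raw = df["Gender"].drop_duplicates().tolist()

# Strip surrounding whitespace first, then deduplicate and convert to a list.
genders = df["Gender"].str.strip().drop_duplicates().tolist()

print(raw)      # ['Male ', 'Female', 'Non-Binary', 'Male']
print(genders)  # ['Male', 'Female', 'Non-Binary']
```

Both approaches keep values in order of first appearance. If the real data has other inconsistencies (for example, different capitalization), it would need further normalization before deduplicating.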