Pandas return column data as list without duplicates

February 13, 2022

This is an oversimplified example, but I have a large categorical dataset.

Name   Age Gender 
John    12 Male 
Ana     24 Female
Dave    16 Female
Cynthia 17 Non-Binary
Wayne   26 Male
Hebrew  29 Non-Binary

Suppose it is assigned to df and I want to get the Gender column back as a list without duplicate values:

'Male','Female','Non-Binary'

I tried this code, but it returns the genders with duplicates:

list(df['Gender'])

How can I write this in pandas so that it returns the values without duplicates?

>Solution :

Remember that df["Gender"] is a pandas Series, so you can use .drop_duplicates() to get another Series with the duplicated values removed, or .unique() to get a NumPy array containing the unique values. Either result can be turned into a plain Python list with .tolist().

>> df["Gender"].drop_duplicates()
0         Male 
1        Female
3    Non-Binary
4          Male
Name: Gender, dtype: object

>> df["Gender"].unique()
['Male ' 'Female' 'Non-Binary' 'Male']
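
Note that both outputs above still list Male twice: the first entry is 'Male ' with a trailing space, which pandas treats as a different value from 'Male'. A minimal sketch of one way to handle this, assuming the stray whitespace is unintended, is to strip it before dropping duplicates. The DataFrame below is rebuilt by hand from the question's table, and the variable name genders is only illustrative.

```python
import pandas as pd

# Sample frame mirroring the question's table; the trailing space in
# "Male " is kept on purpose to reproduce the output shown above.
df = pd.DataFrame({
    "Name": ["John", "Ana", "Dave", "Cynthia", "Wayne", "Hebrew"],
    "Age": [12, 24, 16, 17, 26, 29],
    "Gender": ["Male ", "Female", "Female", "Non-Binary", "Male", "Non-Binary"],
})

# Strip surrounding whitespace, drop repeated values (first occurrence wins),
# then convert the resulting Series to a plain Python list.
genders = df["Gender"].str.strip().drop_duplicates().tolist()

print(genders)  # ['Male', 'Female', 'Non-Binary']
```

The values come back in order of first appearance. If the whitespace is meaningful in your data, leave out the .str.strip() step.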