How to make a new df with column name and unique values?

I am trying to create a new df that lists every column together with its unique values. I have the following code, but I think I am referencing the column of the df incorrectly inside the loop.

#Create empty df
df_unique = pd.DataFrame()
#Loop to take unique values from each column and append to df
for col in df:
    list = df(col).unique().tolist()
    df_unique.loc[len(df_unique)] = list

To visualize what I am hoping to achieve, I’ve included a before and after example below.

Before

ID     Name        Zip       Type
01     Bennett     10115     House
02     Sally       10119     Apt
03     Ben         11001     House
04     Bennett     10119     House

After

Column List_of_unique
ID     01,  02,  03,  04
Name   Bennett,  Sally,  Ben
Zip    10115,  10119,  11001
Type   House,  Apt

Solution:

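You can use df.apply with np.unique (assuming numpy is imported as np), which returns each column's unique values in sorted order: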

>>> df.apply(np.unique)

ID               [1, 2, 3, 4]
Name    [Ben, Bennett, Sally]
Zip     [10115, 10119, 11001]
Type             [Apt, House]
dtype: object

# OR, to keep the order of first appearance and get a two-column frame:
>>> (df.apply(lambda x: ', '.join(x.unique().astype(str)))
       .rename_axis('Column').rename('List_of_unique').reset_index())

  Column       List_of_unique
0     ID           1, 2, 3, 4
1   Name  Bennett, Sally, Ben
2    Zip  10115, 10119, 11001
3   Type           House, Apt
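
For completeness, here is a minimal sketch of how the loop from the question could be repaired, since the answers above replace it rather than fix it. Two things stand out in the original: columns are selected with square brackets (df[col]), whereas df(col) tries to call the DataFrame like a function, and naming a variable list shadows the Python built-in. The sample data is an assumption rebuilt from the "Before" table, with ID and Zip stored as strings so the leading zeros survive; the name rows is introduced here for illustration only.

```python
import pandas as pd

# Sample data mirroring the "Before" table. Storing ID and Zip as strings
# is an assumption (the original dtypes are not shown in the question).
df = pd.DataFrame({
    "ID": ["01", "02", "03", "04"],
    "Name": ["Bennett", "Sally", "Ben", "Bennett"],
    "Zip": ["10115", "10119", "11001", "10119"],
    "Type": ["House", "Apt", "House", "House"],
})

# Collect one record per column, then build the result frame in one go
# instead of appending rows to an empty DataFrame.
rows = []
for col in df.columns:
    # Square brackets select the column; unique() keeps first-appearance order.
    unique_values = df[col].unique().tolist()
    rows.append({
        "Column": col,
        "List_of_unique": ", ".join(map(str, unique_values)),
    })

df_unique = pd.DataFrame(rows, columns=["Column", "List_of_unique"])
print(df_unique)
```

Building a list of records and constructing the frame once at the end avoids assigning rows to a DataFrame that has no columns yet, and it is usually faster than growing a frame row by row.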
