Check whether the values in a column are unique, and label each row "unique" or "not unique" in a new column

I have a DataFrame that looks like:

   c1   c2   c3
0  10  100  200
1  11  110  233
2  12  120  444
3  33  100  776

I need to go through each row of the DataFrame and check whether the value in c2 is unique within c2 (i.e. that value occurs only once in the entire c2 column of that df). If it is not unique, add "not unique" to the end of the row; if it is unique, add "unique" to the end of the row. Expected output:

   c1   c2   c3          c4
0  10  100  200  not unique
1  11  110  233      unique
2  12  120  444      unique
3  33  100  776  not unique

I have tried a few things so far and have not been able to get the results that I want:

for x in dfs:
    if x["c2"].unique(): #i also tried x[x['c2']]
        dfs["duplicated"] = "unique"
    else:
        dfs["duplicated"] = "not_unique"

or

dfs["Duplicates"] = np.where(dfs.c2.duplicated(), "not_unique", "unique")

Solution:

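Use numpy.where with Series.duplicated, passing keep=False so that every occurrence of a repeated value is flagged. With the default keep='first', the first occurrence is left unmarked, which is why the second attempt above labels row 0 as unique: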

In [318]: import numpy as np

In [319]: df['c4'] = np.where(df['c2'].duplicated(keep=False), 'not unique', 'unique')

In [320]: df
Out[320]: 
   c1   c2   c3          c4
0  10  100  200  not unique
1  11  110  233      unique
2  12  120  444      unique
3  33  100  776  not unique
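
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. It rebuilds the sample DataFrame from the question and then labels the rows two ways: with duplicated(keep=False) as above, and with a value_counts lookup as a cross-check. The names is_repeated, counts, and c4_alt are made up for this illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "c1": [10, 11, 12, 33],
    "c2": [100, 110, 120, 100],
    "c3": [200, 233, 444, 776],
})

# keep=False marks every row whose c2 value occurs more than once,
# including the first occurrence.
is_repeated = df["c2"].duplicated(keep=False)
df["c4"] = np.where(is_repeated, "not unique", "unique")

# Cross-check (illustrative alternative): look up how often each value occurs.
counts = df["c2"].map(df["c2"].value_counts())
df["c4_alt"] = np.where(counts > 1, "not unique", "unique")

print(df)
```

Both columns should agree on this data. The value_counts variant is handy if you also want to keep the occurrence count itself.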
