In python/pandas is there a way to find a row that has a duplicate value in one column and a unique value in another?

For example:

Say I have a dataframe like

Date Registered Name Gift
2021-10-30 John Doe Money
2021-10-30 John Doe Food
2021-11-02 Tyler Blue Gift Card
2021-11-02 Tyler Blue Food
2021-12-01 John Doe Supplies

I want to locate all indexes where an entry in name has a unique value in date. Like so:

Date Registered Name Gift
2021-10-30 John Doe Money
2021-11-02 Tyler Blue Gift Card
2021-12-01 John Doe Supplies

I tried this:

name_view = df.drop_duplicates(subset=['Name', 'DateTime'], keep= 'last')
def extract_name(TableName):
    return TableName.duplicated(subset=['Name']) 
extract_name(name_view)

But this does not get rid of all the indexes with duplicate dates. Any suggestions? I’m fine with it simply returning a list of the indexes as well, it isn’t required to output the full row.

>Solution :

This will give the requested output as a list of index values:

print(df.reset_index().groupby(['Date Registered','Name']).first()['index'].tolist())

Output:

[0, 2, 4]

Leave a Reply