For example:
Say I have a dataframe like
Date Registered | Name | Gift |
---|---|---|
2021-10-30 | John Doe | Money |
2021-10-30 | John Doe | Food |
2021-11-02 | Tyler Blue | Gift Card |
2021-11-02 | Tyler Blue | Food |
2021-12-01 | John Doe | Supplies |
I want to locate all indexes where an entry in name has a unique value in date. Like so:
Date Registered | Name | Gift |
---|---|---|
2021-10-30 | John Doe | Money |
2021-11-02 | Tyler Blue | Gift Card |
2021-12-01 | John Doe | Supplies |
I tried this:
name_view = df.drop_duplicates(subset=['Name', 'DateTime'], keep= 'last')
def extract_name(TableName):
return TableName.duplicated(subset=['Name'])
extract_name(name_view)
But this does not get rid of all the indexes with duplicate dates. Any suggestions? I’m fine with it simply returning a list of the indexes as well, it isn’t required to output the full row.
>Solution :
This will give the requested output as a list of index values:
print(df.reset_index().groupby(['Date Registered','Name']).first()['index'].tolist())
Output:
[0, 2, 4]