I have a DataFrame df_daily and want to check for duplicates in my date column, df_daily[0]. Every row whose date appears more than once should be deleted, so that only rows with a unique date remain. I tried the following but get a syntax error:
dp_check = df_daily.drop[df_daily[df_daily[0].drop_duplicates(keep = False)].index, inplace = True]
>Solution :
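The syntax error comes from calling drop with square brackets instead of parentheses. Also, drop_duplicates returns the de-duplicated values rather than a boolean mask, so it cannot be used for boolean indexing here. Use Series.duplicated(keep=False) instead, which marks every occurrence of a repeated value, and invert the mask with ~ in boolean indexing: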
import pandas as pd
df_daily = pd.DataFrame({0: [4, 5, 4, 6, 5, 8]})
dp_check = df_daily[~df_daily[0].duplicated(keep=False)]
Or use DataFrame.drop_duplicates with the subset parameter:
dp_check = df_daily.drop_duplicates(subset=[0], keep=False)
print(dp_check)
   0
3  6
5  8
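
To connect this back to the question's date column, here is a minimal sketch on made-up data. The dates and the extra "value" column are assumptions for illustration only; the column label 0 mirrors df_daily[0] from the question. It also contrasts keep=False with the default keep="first", which would retain one copy of each repeated date.

```python
import pandas as pd

# Hypothetical sample data; only the column label 0 comes from the question.
df_daily = pd.DataFrame({
    0: pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-01", "2023-01-03"]),
    "value": [10, 20, 30, 40],
})

# True for every row whose date occurs more than once.
mask = df_daily[0].duplicated(keep=False)

# Inverting the mask keeps only rows whose date is unique.
unique_only = df_daily[~mask]

# For comparison: the default keep="first" retains the first copy of each repeated date.
first_kept = df_daily.drop_duplicates(subset=[0])

print(unique_only)
print(first_kept)
```

Both approaches return a new DataFrame and leave df_daily unchanged. If you pass inplace=True instead, the method returns None, so assigning its result to dp_check would not give you the filtered frame.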