Replace duplicate values with Null and keep only last value

February 9, 2023

I have a dataframe which looks like this:

   ID      Date       EoM_Val    Val
---------------------------------------
  AAA   2021-06-30    3946750     3
  AAA   2021-06-30    3946750     2
  BBB   2021-06-30    2792182     6
  BBB   2021-06-30    2792182     7

And basically, I want to only keep the the last EoM_Val based on the ID and Date columns, such that the final output looks like this:

   ID      Date       EoM_Val    Val
---------------------------------------
  AAA   2021-06-30                3
  AAA   2021-06-30    3946750     2
  BBB   2021-06-30                6
  BBB   2021-06-30    2792182     7

Is there a good way of doing this? Any help would be greatly appreciated, thanks!

>Solution :

I try this code, I hope It may use

df=pd.DataFrame({"ID":["AAA","AAA","BBB","BBB"],"Date":["2021-06-30","2021-06-30","2021-06-30","2021-06-30"],"EoM_Val":[3946750,3946750,2792182,2792182],"Val":[3,2,6,7]})
df2=df.drop_duplicates(["EoM_Val"],keep="first")
indices = list(df2.index)
df.loc[indices,'EoM_Val'] = ""
print(df)

I got output like this

   ID      Date       EoM_Val    Val
---------------------------------------
  AAA   2021-06-30                3
  AAA   2021-06-30    3946750     2
  BBB   2021-06-30                6
  BBB   2021-06-30    2792182     7