I am trying to replace the first occurrence of the status for each ID. My dataset looks like this:
df=
Index ID Status
0 1895001 review
1 1895001 review
2 1895001 review
3 2104264 review
4 2102404 review
5 2102404 review
6 1809905 review
7 1809905 review
8 1809905 review
9 1811700 review
I tried this: df.values[df.index, np.argmax(df.values=="review",1)] = "first review", but it replaces all of them 🙁
This is what I am expecting:
df=
Index ID Status
0 1895001 first review
1 1895001 review
2 1895001 review
3 2104264 first review
4 2102404 first review
5 2102404 review
6 1809905 first review
7 1809905 review
8 1809905 review
9 1811700 first review
>Solution :
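Use boolean indexing with the boolean inverse (~) of duplicated. duplicated flags every repeat of an ID after its first appearance, so the inverse selects only the first row of each ID: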
df.loc[~df['ID'].duplicated(), 'Status'] = 'first review'
Output:
Index ID Status
0 0 1895001 first review
1 1 1895001 review
2 2 1895001 review
3 3 2104264 first review
4 4 2102404 first review
5 5 2102404 review
6 6 1809905 first review
7 7 1809905 review
8 8 1809905 review
9 9 1811700 first review
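For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It assumes the frame has just the ID and Status columns from the question (the sample data is retyped by hand), and the names is_first and is_first_alt are only illustrative. It also builds the same mask a second way with groupby(...).cumcount() as a cross-check; that alternative is not part of the answer above.

```python
import pandas as pd

# Rebuild the sample frame from the question (retyped by hand, so treat it as an assumption).
df = pd.DataFrame({
    "ID": [1895001, 1895001, 1895001, 2104264, 2102404,
           2102404, 1809905, 1809905, 1809905, 1811700],
    "Status": ["review"] * 10,
})

# duplicated() is False for the first row of each ID and True for later repeats
# (keep="first" is the default), so ~ selects the first row per ID.
is_first = ~df["ID"].duplicated()

# Alternative way to get the same mask: number the rows within each ID group
# starting at 0 and keep the rows numbered 0.
is_first_alt = df.groupby("ID").cumcount().eq(0)

# Relabel only the first row of each ID.
df.loc[is_first, "Status"] = "first review"

print(df)
```

Note that duplicated looks at the whole column, so an ID that reappears later in the frame after other IDs is still treated as a repeat; the rows do not need to be contiguous.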