I have a pandas DataFrame with two columns, code and name.
The same code can appear in several rows, each with a different name.
How can I replace the name in every row with the name from the last occurrence of that row's code?
| code | name |
|---|---|
| 1 | 3 |
| 1 | 6 |
| 2 | 5 |
| 3 | 4 |
| 1 | 7 |
Required output
| code | name |
|---|---|
| 1 | 7 |
| 1 | 7 |
| 2 | 5 |
| 3 | 4 |
| 1 | 7 |
>Solution :
You can use df.groupby('code') to group the rows by code and take the last value of name within each group. Using transform instead of a plain aggregation returns a result with the same index as the original DataFrame, so it can be assigned straight back to the name column:
import pandas as pd
df = pd.DataFrame({'code': [1, 1, 2, 3, 1], 'name': [3, 6, 5, 4, 7]})
df['name'] = df.groupby('code')['name'].transform('last')
print(df)
Output:
   code  name
0     1     7
1     1     7
2     2     5
3     3     4
4     1     7
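
As a cross-check, here is a minimal sketch of another way to get the same result, assuming the sample data from the question. It builds a code-to-name lookup from the last row of each code and maps it back onto the code column. The variable names (`last_name`, `mapped`, `transformed`) are only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"code": [1, 1, 2, 3, 1], "name": [3, 6, 5, 4, 7]})

# Keep only the last row for each code, then index the names by code
# to get a lookup Series: code -> name of the last occurrence.
last_name = df.drop_duplicates("code", keep="last").set_index("code")["name"]

# Map every row's code through the lookup.
mapped = df["code"].map(last_name)

# The groupby/transform approach from the answer, for comparison.
transformed = df.groupby("code")["name"].transform("last")

print(mapped.tolist())
print(transformed.tolist())
```

On this data both approaches agree. They can differ when name contains missing values: by default the groupby 'last' aggregation skips NaN and returns the last non-null name in each group, while the lookup above takes whatever value is in the last row for that code.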