Hello there, I have a data frame that looks like this:
| | AOI | Year | Hectares |
|---|---|---|---|
| 1 | Quneitra | 2015 | 11.46 |
| 2 | Quneitra | 2016 | 12.35 |
| 3 | Quneitra | 2017 | 14.65 |
| 4 | Hamah | 2015 | 1.8 |
| 5 | Hamah | 2016 | 2.7 |
| 6 | Hamah | 2017 | 3.5 |
| 9 | Tartus | 2015 | 12.2 |
| 10 | Tartus | 2016 | 12.7 |
| 11 | Tartus | 2017 | 14.2 |
How can I change this DataFrame to remove the repeated names in the AOI column without affecting the data in the adjacent columns?
>Solution :
You can use:
df.loc[df['AOI'].duplicated(), 'AOI'] = ''
but be aware that this should be done for display only; otherwise the blanked values will prevent you from using your data for analysis!
To avoid modifying the original DataFrame, build a separate copy instead:
df2 = df.assign(AOI=df['AOI'].mask(df['AOI'].duplicated(), ''))
output:
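| | AOI | Year | Hectares |
|---|---|---|---|
| 1 | Quneitra | 2015 | 11.46 |
| 2 | | 2016 | 12.35 |
| 3 | | 2017 | 14.65 |
| 4 | Hamah | 2015 | 1.80 |
| 5 | | 2016 | 2.70 |
| 6 | | 2017 | 3.50 |
| 9 | Tartus | 2015 | 12.20 |
| 10 | | 2016 | 12.70 |
| 11 | | 2017 | 14.20 |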
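
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the names `display_df` and `totals` are just illustrative choices, not anything from the original post. It also shows why keeping the original untouched matters: grouping by `AOI` still works on `df`, which would not be the case once the repeated names are blanked.

```python
import pandas as pd

# Sample data reconstructed from the question (index values kept as shown there).
df = pd.DataFrame(
    {
        "AOI": ["Quneitra"] * 3 + ["Hamah"] * 3 + ["Tartus"] * 3,
        "Year": [2015, 2016, 2017] * 3,
        "Hectares": [11.46, 12.35, 14.65, 1.8, 2.7, 3.5, 12.2, 12.7, 14.2],
    },
    index=[1, 2, 3, 4, 5, 6, 9, 10, 11],
)

# Display-only copy: keep the first occurrence of each AOI, blank the repeats.
# duplicated() is True for every occurrence after the first one.
display_df = df.assign(AOI=df["AOI"].mask(df["AOI"].duplicated(), ""))
print(display_df)

# The original df is unchanged, so per-AOI analysis still works on it.
totals = df.groupby("AOI")["Hectares"].sum()
print(totals)
```

Note that `duplicated()` flags any later repeat of a value, not only consecutive ones, so this assumes the rows are already grouped by `AOI` as in the question.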