I have the following DataFrame in pandas:
| code | town | village | city |
|---|---|---|---|
| 01 | Brunete | NaN | NaN |
| 02 | NaN | Cabrera d’Anoia | NaN |
| 03 | NaN | NaN | Barcelona |
| 04 | Zarzalejo | NaN | Madrid |
| 07 | Melilla | NaN | City of Melilla |
| 08 | Cartagena | Galifa | Region of Murcia |
I want to merge the columns town, city and village into one. In case more than 1 column contains a value other than NaN, the resulting column will be assigned the value in this order of preference: village > town > city. Example:
| code | merged_column |
|---|---|
| 01 | Brunete |
| 02 | Cabrera d’Anoia |
| 03 | Barcelona |
| 04 | Zarzalejo |
| 07 | Melilla |
| 08 | Galifa |
>Solution :
Let us define the order of columns then select those columns and use backfill along columns axis
order = ['village', 'town', 'city']
df['merged'] = df[order].bfill(axis=1).iloc[:, 0]
code town village city merged
0 1 Brunete NaN NaN Brunete
1 2 NaN Cabrera d'Anoia NaN Cabrera d'Anoia
2 3 NaN NaN Barcelona Barcelona
3 4 Zarzalejo NaN Madrid Zarzalejo
4 7 Melilla NaN Melilla Melilla
5 8 Cartagena Galifa Region of Murcia Galifa