Change dataframe values based on other dataframes

April 19, 2023

Let’s say we have the following "df1" dataframe with cities as column names:

            NY  LA  Rome  London  Milan
date                                   
2023-01-01   1  81    26      55     95
2023-01-02  92  42    96      98      7
2023-01-03  14   4    60      88     73

In another "df2" dataframe I have cities and their countries:

     City Country
0      NY      US
1      LA      US
2  London      UK
3    Rome   Italy
4   Milan   Italy

In a third "df3" dataframe I have some values for each country and each date:

            US  UK  Italy
date                     
2023-01-01  70  41     32
2023-01-02  98  46     45
2023-01-03  83  50     17

My output dataframe has the same strutcture as the first dataframe. This is the expected output:

            NY  LA  Rome  London  Milan
date                                   
2023-01-01 -69  11    -6      14     63
2023-01-02  -6 -56    51      52    -38
2023-01-03 -69 -79    43      38     56

For example, the 51 value for "Rome" on 2023-01-02 is the difference between the value of the same cell from df1 (96) and the value of the country where Rome is located (Italy) on 2023-01-02 (45).

Any help? Thanks.

>Solution :

Try stack df3, merge it with df2 to get the correct data for the cities, then you can sub:

df1.sub(df3.stack().reset_index(name='value')
    .merge(df2, left_on='level_1', right_on='Country')
    .pivot(index='date', columns='City', values='value')
)

Output:

            LA  London  Milan  NY  Rome
date                                   
2023-01-01  11      14     63 -69    -6
2023-01-02 -56      52    -38  -6    51
2023-01-03 -79      38     56 -69    43