YOY growth based on ID

February 9, 2022

I am trying to calculate year or year growth for a variable in a Pandas dataframe. My data looks like this:

Year	Country	Industry	Value
2000	USA	Manufacturing	5
2000	Mexico	Manufacturing	10
2001	Mexico	Manufacturing	15
2002	Mexico	Other	20

I have different number of observations depending on the Country or Industry. Expected output:

Year	Country	Industry	Value	YOY
2000	USA	Manufacturing	5	NaN
2000	Mexico	Manufacturing	10	NaN
2001	Mexico	Manufacturing	15	50%
2002	Mexico	Other	20	NaN

I tried different things including:

df.groupby(['Country','Industry','Year'])['Value'].pct_change()

df['YOY'] = (df['Value'] - df.sort_values(by=['Country','Industry','Year']).groupby(['Country','Industry'])['Value'].shift(1))) / df['Value']

The first line calculates growth between rows without resetting for a new Country or Industry. The second one has incoherent results.

Any lead I could take? Thanks!!

>Solution :

Try this:

df['YOY'] = df.groupby(['Country','Industry'])['Value'].pct_change().mul(100)

Output:

>>> df
   Year Country       Industry  Value   YOY
0  2000     USA  Manufacturing      5   NaN
1  2000  Mexico  Manufacturing     10   NaN
2  2001  Mexico  Manufacturing     15  50.0
3  2002  Mexico          Other     20   NaN