Add the remaining quarterly columns that do not have any values

I have monthly time series data as follows:

  Item  2020 Jan  2020 Feb  2020 Mar  2020 Jul  2020 Aug  2020 Sep
0    A         0         1         2         3         4         5
1    B         5         4         3         2         1         0

I needed to convert this monthly data into quarterly data as a sum of three months, so I adapted the code from here to do just that:

df = df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1).sum()

The resulting df is as follows:

  Item  2020 Q1  2020 Q3
0    A        3       12
1    B       12        3

However, I wanted to have all quarters listed in between, not just the ones that have a value larger than 0. So my desired output is as follows:

  Item  2020 Q1  2020 Q2  2020 Q3
0    A        3        0       12
1    B       12        0        3

So in this output, there is an additional column Q2 because it is between the first and the last quarter periods, even though it has no positive values. Does anybody know how to do that?

>Solution :

Use DataFrame.reindex with period_range:

df = df.set_index('Item')
df.columns = pd.to_datetime(df.columns)

df = df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1).sum()
df = df.reindex(pd.period_range(df.columns.min(), df.columns.max(), freq='Q'), axis=1, fill_value=0)
print (df)
      2020Q1  2020Q2  2020Q3
Item                        
A          3       0      12
B         12       0       3

Leave a Reply