Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Create a new dataframe with number of rows based on the months of current dataframe

I have this input data with the month interval defined by values in 2 columns as such:

Customer ID Order quantity  Order valid from    Order valid to
Customer 1  20  Aug-21  Dec-21
Customer 2  50  Oct-21  Jan-22

I would like to create a new dataframe where each of the month in the interval is a new row by itself while retaining data on other columns. The output should look like this:

Customer    Month   Order quantity
Customer 1  Aug-21  20
Customer 1  Sep-21  20
Customer 1  Oct-21  20
Customer 1  Nov-21  20
Customer 1  Dec-21  20
Customer 2  Oct-21  50
Customer 2  Nov-21  50
Customer 2  Dec-21  50
Customer 2  Jan-22  50

Super grateful for any help!

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

df = df.assign(Month=df.apply(lambda x: pd.date_range(x['Order valid from'], x['Order valid to'], freq='M'), axis=1)).explode('Month').drop(['Order valid from', 'Order valid to'], axis=1).reset_index(drop=True)

Output:

>>> df
  Customer ID  Order quantity      Month
0  Customer 1              20 2021-08-31
1  Customer 1              20 2021-09-30
2  Customer 1              20 2021-10-31
3  Customer 1              20 2021-11-30
4  Customer 2              50 2021-10-31
5  Customer 2              50 2021-11-30
6  Customer 2              50 2021-12-31
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading