Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

expand row based on integer in column and split into number of months between dates

I have the following dataframe:

id date_start date_end reporting_month reporting_month_number months_length
1 2022-03-31 23:56:22 2022-05-01 23:56:22 2022-03 1 3
2 2022-03-31 23:48:48 2022-06-01 23:48:48 2022-03 1 4
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-03 1 6

I would like to split each id row so I can have a row for each of the months_length, starting on the date of reporting_month, like this:

id date_start date_end reporting_month reporting_month_number months_length
1 2022-03-31 23:56:22 2022-05-01 23:56:22 2022-03 1 3
1 2022-03-31 23:56:22 2022-05-01 23:56:22 2022-04 2 3
1 2022-03-31 23:56:22 2022-05-01 23:56:22 2022-05 3 3
2 2022-03-31 23:48:48 2022-06-01 23:48:48 2022-03 1 4
2 2022-03-31 23:48:48 2022-06-01 23:48:48 2022-03 2 4
2 2022-03-31 23:48:48 2022-06-01 23:48:48 2022-04 3 4
2 2022-03-31 23:48:48 2022-06-01 23:48:48 2022-05 4 4
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-03 1 6
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-04 2 6
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-05 3 6
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-06 4 6
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-07 5 6
3 2022-03-31 23:47:36 2022-08-01 23:47:36 2022-08 6 6

I have tried several approaches but I can’t seem to reach my objective.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Does anyone have a suggestion on how to achieve this?

Thanks.

>Solution :

One possible solution is,

df= df.loc[df.index.repeat(df['months_length'])].reset_index(drop=True)
df['reporting_month_number'] = df.groupby('id')['reporting_month_number'].cumsum()

O/P:

    id           date_start             date_end reporting_month  \
0    1  2022-03-31 23:56:22  2022-05-01 23:56:22         2022-03   
1    1  2022-03-31 23:56:22  2022-05-01 23:56:22         2022-03   
2    1  2022-03-31 23:56:22  2022-05-01 23:56:22         2022-03   
3    2  2022-03-31 23:48:48  2022-06-01 23:48:48         2022-03   
4    2  2022-03-31 23:48:48  2022-06-01 23:48:48         2022-03   
5    2  2022-03-31 23:48:48  2022-06-01 23:48:48         2022-03   
6    2  2022-03-31 23:48:48  2022-06-01 23:48:48         2022-03   
7    3  2022-03-31 23:47:36  2022-08-01 23:47:36         2022-03   
8    3  2022-03-31 23:47:36  2022-08-01 23:47:36         2022-03   
9    3  2022-03-31 23:47:36  2022-08-01 23:47:36         2022-03   
10   3  2022-03-31 23:47:36  2022-08-01 23:47:36         2022-03   
11   3  2022-03-31 23:47:36  2022-08-01 23:47:36         2022-03   
12   3  2022-03-31 23:47:36  2022-08-01 23:47:36         2022-03   

    reporting_month_number  months_length  
0                        1              3  
1                        2              3  
2                        3              3  
3                        1              4  
4                        2              4  
5                        3              4  
6                        4              4  
7                        1              6  
8                        2              6  
9                        3              6  
10                       4              6  
11                       5              6  
12                       6              6  


Explanation:

  1. Repeat rows based on months_length
  2. Update Reporing Month Number based on groupby ‘id’
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading