Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I convert time series data from wide to long format using python (pandas package)?

I have some data taken at different time points in wide format, and need to convert it to long format to aid with analysis and to merge it with another dataset.

The format of the data is (where A_0 means value of A at time 0, A_15 means value at time 15):

import pandas as pd

df_wide = pd.DataFrame({'Subject': ['AA', 'BB', 'CC', 'DD'],
               'A_0': [1, 2, 3, 4],
               'A_15': [2, 3, 4, 5],
               'A_30': [3, 4, 5, 6],
               'B_0': [1, 2, 3, 4],
               'B_15': [2, 3, 4, 5],
               'B_30': [3, 4, 5, 6],
               'C_0': [1, 2, 3, 4],
               'C_15': [2, 3, 4, 5],
               'C_30': [3, 4, 5, 6]
                  
              }
             )

df_wide
 Subject    A_0 A_15    A_30    B_0 B_15    B_30    C_0 C_15    C_30
0   AA  1   2   3   1   2   3   1   2   3
1   BB  2   3   4   2   3   4   2   3   4
2   CC  3   4   5   3   4   5   3   4   5
3   DD  4   5   6   4   5   6   4   5   6

I wish to convert this to long format and generate the variable time as follow:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df_long = pd.DataFrame({'Subject': ['AA', 'AA', 'AA', 'BB', 'BB', 'BB', 'CC', 'CC', 'CC', 'DD', 'DD',       'DD'],
                    'Time': [0, 15, 30, 0, 15, 30, 0, 15, 30, 0, 15, 30],
                    'A': [1, 2, 3, 2, 3, 4, 3, 4, 5, 4, 5, 6],
                    'B': [1, 2, 3, 2, 3, 4, 3, 4, 5, 4, 5, 6],
                    'C': [1, 2, 3, 2, 3, 4, 3, 4, 5, 4, 5, 6],       
                    })

df_long
 Subject    Time    A   B   C
0   AA  0   1   1   1
1   AA  15  2   2   2
2   AA  30  3   3   3
3   BB  0   2   2   2
4   BB  15  3   3   3
5   BB  30  4   4   4
6   CC  0   3   3   3
7   CC  15  4   4   4
8   CC  30  5   5   5
9   DD  0   4   4   4
10  DD  15  5   5   5
11  DD  30  6   6   6

I’ve read the pivot and melt functions but can’t quite get my head around it – any assistance would be greatly appreciated.

Have tired pivot and melt but unsure how to generate the time variable.

>Solution :

You can use pd.wide_to_long:

x = pd.wide_to_long(
    df_wide, i="Subject", j="Time", stubnames=["A", "B", "C"], sep="_"
)
print(x)

Prints:

              A  B  C
Subject Time         
AA      0     1  1  1
BB      0     2  2  2
CC      0     3  3  3
DD      0     4  4  4
AA      15    2  2  2
BB      15    3  3  3
CC      15    4  4  4
DD      15    5  5  5
AA      30    3  3  3
BB      30    4  4  4
CC      30    5  5  5
DD      30    6  6  6
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading