Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to unstack the table based on specific columns in python

I have a table like this name df1

ID       M1  M2   NO        DTime         Result(P2-P3) (P1-P2) (P1-P7) (P3-P7) 
2801596 288 371  536529 08-02-2023 11:07    1    NaN    0.085    NaN      NaN
2801596 289 371  536529 08-02-2023 11:07    1     1     0.032   1.081     NaN
2801584 290 372  541278 08-02-2023 11:10    1    NaN    0.081    NaN      NaN
2801584 291 372  541278 08-02-2023 11:10    0     1     0.037   1.065     NaN
2801598 288 371  541279 08-02-2023 11:12    1    NaN    0.076    NaN      NaN
2801599 288 371  555623 08-02-2023 11:14    1     1      NaN     NaN     3.871
2801599 289 371  555623 08-02-2023 11:14    1     1      NaN     NaN     2.389
2801600 291 372  555624 08-02-2023 11:18    1    NaN    0.0835   NaN      NaN

I have tried using Pivot table but it is giving a table full of Nan.

df2 = pd.pivot_table(df1, values=[‘Result’,'(P2-P3)’,'(P1-P2)’,'(P1-P7)’,'(P3-P7)’], index=[‘ID’,’No’,’DTime’],columns=[‘M2′,’M1′], aggfunc=’first’)

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I want this table as output
enter image description here

>Solution :

Add DataFrame.reorder_levels with DataFrame.sort_index:

df2 = df1.pivot_table(values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'],
                     index=['ID','NO','DTime'],
                     columns=['M2','M1'],
                     aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)

If need processing all columns from input DataFrame not set in index and columns parameter is possible omit values:

df2 = df1.pivot_table(index=['ID','NO','DTime'],
                     columns=['M2','M1'],
                     aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)

Input data:

df1 = pd.DataFrame({'ID': {0: 2801596, 1: 2801596, 2: 2801584, 3: 2801584, 4: 2801598, 5: 2801599, 6: 2801599, 7: 2801600}, 
                    'M1': {0: 288, 1: 289, 2: 290, 3: 291, 4: 288, 5: 288, 6: 289, 7: 291},
                    'M2': {0: 371, 1: 371, 2: 372, 3: 372, 4: 371, 5: 371, 6: 371, 7: 372}, 
                    'NO': {0: 536529, 1: 536529, 2: 541278, 3: 541278, 4: 541279, 5: 555623, 6: 555623, 7: 555624}, 
                    'DTime': {0: '08-02-2023 11:07', 1: '08-02-2023 11:07', 2: '08-02-2023 11:10', 3: '08-02-2023 11:10', 
                              4: '08-02-2023 11:12', 5: '08-02-2023 11:14', 6: '08-02-2023 11:14', 7: '08-02-2023 11:18'}, 
                    'Result': {0: 1, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}, 
                    '(P2-P3)': {0: np.nan , 1: 1.0, 2: np.nan , 3: 1.0, 4: np.nan , 5: 1.0, 6: 1.0, 7: np.nan }, 
                    '(P1-P2)': {0: 0.085, 1: 0.032, 2: 0.081, 3: 0.037, 4: 0.076, 5: np.nan , 6: np.nan , 7: 0.0835},
                    '(P1-P7)': {0: np.nan , 1: 1.081, 2: np.nan , 3: 1.065, 4: np.nan , 5: np.nan , 6: np.nan , 7: np.nan },
                    '(P3-P7)': {0: np.nan , 1: np.nan , 2: np.nan , 3: np.nan , 4: np.nan , 5: 3.871, 6: 2.389, 7: np.nan }})
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading