Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas: reshaping data pandas

I have the following dataset:

df =
id test score date
1  A    1     2000-01-01
1  B    3     2000-01-01
1  A    1     2000-01-30
1  B    5     2000-01-30
2  A    2     2000-01-01
2  B    4     2000-01-01
3  A    6     2000-01-01
4  A    2     2000-01-01

I want to group_by id and date in order to get columns containing the type of the test for the same id and date:

df_outcome =
id testA testB date
1  1     3     2000-01-01
1  1     5     2000-01-30
2  2     4     2000-01-01

note that individuals 3 and 4 do not have test A and B in the same date and therefore are excluded from the analysis.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

(df.pivot(index=["id", "date"], columns="test", values="score")
   .dropna()
   .rename_axis(columns=None)
   .add_prefix("test")
   .reset_index()
   .convert_dtypes())
  • pivot over id & date with test variables in the column, scores as the content
  • NaN will arise if a id-date pair doesn’t have A or B score: so drop them
  • move the "test" from the name of the columns to be the prefixes of the column names, i.e., testA, testB
  • move id & date back to the columns side
  • due to possible NaNs, things got floated, so convert datatypes for depromotion if possible, i.e., float to int here

to get

   id        date  testA  testB
0   1  2000-01-01      1      3
1   1  2000-01-30      1      5
2   2  2000-01-01      2      4
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading