Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

unique melt and pivot crosstab format with a groupby using pandas

I have a dataset where I would like for some values to become column headers, as well as a crosstab format.

data

year    qtr     ID  type    growth  re     nondd_re   se_re or
2024    2024Q1  NY  aa      3.18    1.14    0         0     0
2024    2024Q2  NY  aa      2.1     1.14    0         0     0
2024    2024Q1  NY  dd      6.26    3.07    3.07      0     0
2024    2024Q2  NY  dd      4.13    3.07    3.07      0     0
2024    2024Q1  CA  aa      0       0       0         0     0
2024    2024Q2  CA  aa      0.03    0       0         0     0
2024    2024Q1  CA  dd      0       0       0         0     0
2024    2024Q2  CA  dd      0.06    0       0         0     0

desired

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

ID  type       type 2024Q1  2024Q2
NY  growth     dd   6.26    4.13
NY  nond_ re   dd   3.07    3.07
NY  se_re      dd   0       0
NY  or         dd   0       0
NY  re         dd   3.07    3.07
NY  growth     aa   3.18    2.1
NY  nond_ re   aa   0       0
NY  se_re      aa   0       0
NY  or         aa   0       0
NY  re         aa   1.14    1.14
CA  growth     dd   0       0.6
CA  nond_ re   dd   0       0
CA  se_re      dd   0       0
CA  or         dd   0       0
CA  re         dd   0       0
CA  growth     aa   0       0.3
CA  nond_ re   aa   0       0
CA  se_re      aa   0       0
CA  or         aa   0       0
CA  re         aa   0       0 

doing

Melt the dataframe to transform metrics columns into rows

melted_df = df.melt(id_vars=["year", "qtr", "ID", "type"], var_name="type", value_name="value")

Pivot the melted dataframe

pivot_df = melted_df.pivot_table(index=["ID","type"], columns="qtr", values="value", fill_value=0)

Reset index to turn multi-index into columns

pivot_df = pivot_df.reset_index()

The issue is that all of the values aren’t getting. The above code produces output with missing values
Any suggestion is appreciated

>Solution :

You can try to set_index() + stack()/unstack():

out = (
    df.set_index(["year", "qtr", "ID", "type"])
    .stack()
    .unstack("qtr")
    .reset_index()
    .rename(columns={"level_3": "type2"})
    .rename_axis(columns=None, index=None)
)

print(out)

Prints:

    year  ID type     type2  2024Q1  2024Q2
0   2024  CA   aa    growth    0.00    0.03
1   2024  CA   aa        re    0.00    0.00
2   2024  CA   aa  nondd_re    0.00    0.00
3   2024  CA   aa     se_re    0.00    0.00
4   2024  CA   aa        or    0.00    0.00
5   2024  CA   dd    growth    0.00    0.06
6   2024  CA   dd        re    0.00    0.00
7   2024  CA   dd  nondd_re    0.00    0.00
8   2024  CA   dd     se_re    0.00    0.00
9   2024  CA   dd        or    0.00    0.00
10  2024  NY   aa    growth    3.18    2.10
11  2024  NY   aa        re    1.14    1.14
12  2024  NY   aa  nondd_re    0.00    0.00
13  2024  NY   aa     se_re    0.00    0.00
14  2024  NY   aa        or    0.00    0.00
15  2024  NY   dd    growth    6.26    4.13
16  2024  NY   dd        re    3.07    3.07
17  2024  NY   dd  nondd_re    3.07    3.07
18  2024  NY   dd     se_re    0.00    0.00
19  2024  NY   dd        or    0.00    0.00
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading