Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I melt a dataframe from wide to long twice on the same column?

I have a dataframe:

data = [
    [1, "2022-04-29", 123, "circle", 1, 3, 6, 7.3],
    [1, "2022-02-10", 456, "square", 4, np.nan, 3, 9],
]

df = pd.DataFrame(
    data,
    columns=[
        "ID",
        "date",
        "code",
        "shape",
        "circle_X_rating",
        "circle_Y_rating",
        "square_X_rating",
        "square_Y_rating",
    ],
)
df


ID date      code shape circle_X_rating circle_Y_rating square_X_rating square_Y_rating
1 2022-04-29  123 circle       1               3.0             6              7.3
1 2022-02-10  456 square       4               NaN             3              9.0

I would like to melt this dataframe such that there is a shape column and 2 columns for rating X_rating and Y_rating, and I am not sure how to do this. Currently I am melting it and this is what I get:

test = (
    pd.melt(
        df,
        id_vars=[
            "ID",
            "date",
            "bar_code",
            "shape",
        ],
        value_vars=[
            "circle_X_rating",
            "circle_Y_rating",
            "square_X_rating",
            "square_Y_rating",
        ],
        var_name="shape_for_rating",
        value_name="shape_rating",
    )
    .assign(
        shape_for_rating=lambda df: df["shape"].apply(lambda a_str: a_str.split("_")[0])
    )
    .query("shape == shape")
    .drop(columns=["shape_for_rating"])
)
test

    ID  date        code    shape   shape_rating
0   1   2022-04-29  123     circle      1.0
1   1   2022-02-10  456     square      4.0
2   1   2022-04-29  123     circle      3.0
3   1   2022-02-10  456     square      NaN
4   1   2022-04-29  123     circle      6.0
5   1   2022-02-10  456     square      3.0
6   1   2022-04-29  123     circle      7.3
7   1   2022-02-10  456     square      9.0

But what I’d really like is:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

    ID  date        code    shape   X_rating   Y_rating
0   1   2022-04-29  123     circle      1.0       3
1   1   2022-04-29  123     square      6.0      7.3
2   1   2022-02-10  456     circle      4        NaN
3   1   2022-02-10  456     square      3         9
...

Does anyone know the best way to do this? I’ve been spinning my wheels at this.

>Solution :

Try wide_to_long

df.columns = df.columns.str.split('_',n=1).map(lambda x : '_'.join(x[::-1]))

df = pd.wide_to_long(df, 
                    stubnames = ['X_rating','Y_rating'], 
                    i = ['ID', 'date', 'code', 'shape'], 
                    j = 'shape1',
                    sep = '_'
                    suffix = r'\w+').reset_index()
df
Out[84]: 
   ID        date  code   shape   shape1  X_rating  Y_rating
0   1  2022-04-29   123  circle   circle         1       3.0
1   1  2022-04-29   123  circle   square         6       7.3
2   1  2022-02-10   456  square   circle         4       NaN
3   1  2022-02-10   456  square   square         3       9.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading