Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Organising dataframe with pandas and pivot table

I would like to modified my dataframe using melt and table pivot.

I have this dataframe :

df = pd.DataFrame({'Pays': {0: 'France', 1: 'France', 2: 'France'},
                   'Indicateur': {0: 'Internet', 1: 'Pop', 2: 'Eco'},
                   '1990': {0: 1, 1: 2, 2: 3},
                   '1995': {0: 7, 1: 8, 2: 9}})
df

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I gather columns date into rows with melt :

df = pd.melt(df, id_vars =['Pays','Indicateur'], value_vars =['1990','1995'],
              var_name ='Years', value_name ='valeur')
df

enter image description here

I spread ‘Indicateur’ rows into columns :

df2 = df.pivot(columns='Indicateur',values='valeur')
df2

enter image description here

I would like this result :

enter image description here

You know how to do it ?

Thank you !

>Solution :

First is not necessary defined years in value_vars in DataFrame.melt, if need all of them:

df = df.melt(id_vars =['Pays','Indicateur'], var_name ='Years', value_name ='valeur')

Also is necessary defined index parameter in DataFrame.pivot, last convert MultiIndex to columns by DataFrame.reset_index and then remove column name by DataFrame.rename_axis:

df2 = (df.pivot(index=['Pays', 'Years'], columns='Indicateur',values='valeur')
         .reset_index()
         .rename_axis(columns=None))
print(df2)
     Pays Years  Eco  Internet  Pop
0  France  1990    3         1    2
1  France  1995    9         7    8

EDIT: If need same format like melted DataFrame:

df1 = df[['Pays','Years']].join(df.set_index('Indicateur', append=True)['valeur'].unstack())

#or using your solution
df1 = df[['Pays','Years']].join(df.pivot(columns='Indicateur',values='valeur'))
print(df1)
     Pays Years  Eco  Internet  Pop
0  France  1990  NaN       1.0  NaN
1  France  1990  NaN       NaN  2.0
2  France  1990  3.0       NaN  NaN
3  France  1995  NaN       7.0  NaN
4  France  1995  NaN       NaN  8.0
5  France  1995  9.0       NaN  NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading