Home Groupby and 'transpose the data to a new dataframe and creating column names

Questions

Groupby and 'transpose the data to a new dataframe and creating column names

January 18, 2024

I’ll preface this by saying I’d consider myself inexperienced in both python and coding in general.

I have a dataframe that I want groupby the site and then move the data into a new dataframe that uses the year as the column headings, I can’t think what this would be called, its not really a transpose as there can be years missing (unless you can somehow transpose and match column names to df[‘year’])

df =

site	year	data
a	2010	10
a	2011	12
a	2012	7
a	2013	45
b	2010	21
b	2012	45
b	2013	54
c	2010	17
c	2011	32
c	2012	24

What I want to end up with is something like this

site	2010	2011	2012	2013
a	10	12	7	45
b	21		45	54
c	17	32	24

This gets me some of the way there, but will not work where there is missing data – I’m not sure how to proceeed and incorporate the taking the years and creating the new columns.

df.groupby(‘site’)[‘data’].apply(lambda df: df.reset_index(drop=True)).unstack()

I feel it shouldn’t be difficult, and that I just can’t come up with the right keywords to search for the answer.

>Solution :

import pandas as pd

df = pd.DataFrame({
"site": ["a", "a", "a", "a", "b", "b", "b", "c", "c", "c"],
"year": [2010, 2011, 2012, 2013, 2010, 2012, 2013, 2010, 2011, 2012],
"data": [10, 12, 7, 45, 21, 45, 54, 17, 32, 24]})

df = df.set_index(["site", "year"])
new_df = df.pivot_table(values="data", index="site", columns="year",     
fill_value=0)
new_df = new_df.sort_index(axis=1)

print(new_df)

This should help you.