I have a pandas DataFrame like this:
Year1 Year2 Total
0 2010 2011 2500
1 2012 2013 3000
2 2014 2015 4000
I want to grab the data in the Year2
column and merge it with the Year1
column, and keep the Total
value associated with it, which should look like:
Year1 Total
0 2010 2500
1 2011 2500
2 2012 3000
3 2013 3000
4 2014 4000
5 2015 4000
I have considered first of all duplicating the df
so that I get the second ‘Total’ value for the 2011, 2013 and 2015
df = pd.DataFrame(np.repeat(df.values, 2, axis=0))
df.columns = ['Year1', 'Year2', 'Total']
but I’m still unsure of the steps to merge the column data from Year2
to Year1
.
>Solution :
You could melt
it:
out = (pd.melt(df, id_vars=['Total']).rename(columns={'value':'Year1'})
.drop(columns='variable')[['Year1', 'Total']]
.sort_values(by='Year1').reset_index(drop=True))
or set_index
with "Total" + unstack
:
out = (df.set_index('Total').unstack().droplevel(0)
.reset_index(name='Year1')[['Year1', 'Total']]
.sort_values(by='Year1').reset_index(drop=True))
Output:
Year1 Total
0 2010 2500
1 2011 2500
2 2012 3000
3 2013 3000
4 2014 4000
5 2015 4000