Create composite variable from multiple variables and add to dataframe

January 9, 2022

I have a dataframe with three median rent variables. The dataframe looks like this:

region_id	year	1bed_med_rent	2bed_med_rent	3bed_med_rent
1	2010	800	1000	1200
1	2011	850	1050	1250
2	2010	900	1000	1100
2	2011	950	1050	1150

I would like to collapse the three rent variables into a single variable holding their median for each region and year, like so:

region_id	year	med_rent
1	2010	1000
1	2011	1050
2	2010	1000
2	2011	1050

Using the agg() function in pandas, I have been able to apply functions to several variables at once, but I have not been able to combine the variables into one and insert the result into the dataframe. I have also tried the assign() function together with the code below, without success.

#Creating the group list of common IDs
group_list = ['region_id', 'year']

#Grouping by common ID and taking median values of each group
new_df = df.groupby(group_list).agg({
    '1bed_med_rent': ['median'],
    '2bed_med_rent': ['median'],
    '3bed_med_rent': ['median'],
}).reset_index()

What other method might there be for this?

>Solution:

Move region_id and year into the index with set_index, then use apply with axis=1 to take the median of the remaining columns in each row:

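(df.set_index(['region_id', 'year'])       # keep the ID columns out of the median
    .apply(lambda r: r.median(), axis=1)   # row-wise median of the rent columns
    .reset_index()
    .rename(columns={0: 'med_rent'})       # the unnamed result column is labelled 0
)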

produces

    region_id   year    med_rent
0   1           2010    1000.0
1   1           2011    1050.0
2   2           2010    1000.0
3   2           2011    1050.0
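
Since the question mentions assign(), the same result can probably also be reached without apply, by calling median(axis=1) directly on the rent columns. The sketch below rebuilds the sample dataframe from the question so that it runs on its own; the names rent_cols and result are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "region_id": [1, 1, 2, 2],
    "year": [2010, 2011, 2010, 2011],
    "1bed_med_rent": [800, 850, 900, 950],
    "2bed_med_rent": [1000, 1050, 1000, 1050],
    "3bed_med_rent": [1200, 1250, 1100, 1150],
})

# Illustrative name: the columns to combine into one variable.
rent_cols = ["1bed_med_rent", "2bed_med_rent", "3bed_med_rent"]

# median(axis=1) works across the columns of each row;
# assign() attaches the result as a new column named med_rent.
result = df.assign(med_rent=df[rent_cols].median(axis=1))

# Keep only the ID columns and the new composite variable.
result = result[["region_id", "year", "med_rent"]]
print(result)
```

This also suggests why the groupby attempt did not combine anything: agg computes the median down the rows of each group, one column at a time, whereas the composite variable needs a median across the columns within each row.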