Selection of columns

I am working with a pandas DataFrame. I want to group the data by one column and then sum the other columns. You can see an example below:

    import pandas as pd

    data = {'name': ['Company1', 'Company2', 'Company1', 'Company2', 'Company5'],
            'income': [0, 180395, 4543168, 7543168, 73],
            'turnover': [4, 24, 31, 2, 3]}
    df = pd.DataFrame(data, columns=['name', 'income', 'turnover'])
    df

INCOME_GROUPED = df.groupby(['name']).agg({'income':sum,'turnover':sum})

The code above works well and gives the expected result. The next step is selection: I want to select only two columns from the INCOME_GROUPED dataframe.

INCOME_SELECT =  INCOME_GROUPED[['name','income']]

But after executing this line of code I get this error:

"None of [Index(['name', 'income'], dtype='object')] are in the [columns]"

Can anybody help me solve this problem?

>Solution :

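After groupby(), 'name' becomes the index of the result rather than a column, so selecting it by column label fails. You need to call reset_index() after agg() to turn it back into a column: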

INCOME_GROUPED = df.groupby(['name']).agg({'income':sum,'turnover':sum}).reset_index()
#                                                                       ^^^^^^^^^^^^^^ add this

Output:

>>> INCOME_GROUPED[['name', 'income']]
       name   income
0  Company1  4543168
1  Company2  7723563
2  Company5       73
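
As an alternative to reset_index(), here is a minimal sketch that passes as_index=False to groupby() so that 'name' stays a regular column. It also prints the index and columns of the default result to show where the error comes from. The variable names grouped, income_grouped and income_select are illustrative only, and the string 'sum' is used in place of the built-in sum, which should give the same result here.

```python
import pandas as pd

data = {'name': ['Company1', 'Company2', 'Company1', 'Company2', 'Company5'],
        'income': [0, 180395, 4543168, 7543168, 73],
        'turnover': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data)

# Default groupby: 'name' is moved into the index, so it is not a column.
grouped = df.groupby('name').agg({'income': 'sum', 'turnover': 'sum'})
print(grouped.index.name)       # the grouping key lives here
print(list(grouped.columns))    # only the aggregated columns

# as_index=False keeps 'name' as an ordinary column,
# so selecting it by label works without reset_index().
income_grouped = df.groupby('name', as_index=False).agg(
    {'income': 'sum', 'turnover': 'sum'})
income_select = income_grouped[['name', 'income']]
print(income_select)
```

Both approaches should produce the same frame. reset_index() is handy when the grouped result already exists, while as_index=False avoids creating the index in the first place.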
