I want to group by several columns and sum the size column separately for each categorical value in the type column, including rows where type is missing.
Data
name size type
AA 9385 FALSE
AA 9460 FALSE
AA 9572 TRUE
AA 9680
BB 10 TRUE
BB 10 TRUE
BB 20 FALSE
BB 20 FALSE
Desired
name size type
AA 9572 TRUE
AA 18845 FALSE
AA 9680
BB 20 TRUE
BB 40 FALSE
BB
Doing
df = df.groupby('name').agg({'size': 'sum', 'type': lambda x: x.value_counts().idxmax()})
However, this appears to drop the rows where type is null. Any suggestions are appreciated.
>Solution :
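Group by both name and type, and pass dropna=False to groupby (available since pandas 1.1) so that rows with a missing type are kept as their own group instead of being dropped: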
df.groupby(['name', 'type'], dropna=False, as_index=False)['size'].sum()
Output:
name type size
0 AA False 18845
1 AA True 9572
2 AA NaN 9680
3 BB False 40
4 BB True 20
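For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the groupby above. It assumes the blank type cell is a NaN and that pandas is at least 1.1; the variable names `result` and `dropped` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the blank "type" cell is assumed to be NaN.
df = pd.DataFrame({
    "name": ["AA", "AA", "AA", "AA", "BB", "BB", "BB", "BB"],
    "size": [9385, 9460, 9572, 9680, 10, 10, 20, 20],
    "type": [False, False, True, np.nan, True, True, False, False],
})

# dropna=False keeps the NaN key as its own group.
result = df.groupby(["name", "type"], dropna=False, as_index=False)["size"].sum()

# Default behaviour (dropna=True) for comparison: the NaN row is discarded.
dropped = df.groupby(["name", "type"], as_index=False)["size"].sum()

print(result)
print(dropped)
```

With the default settings the AA row with the missing type should disappear from `dropped`, while `result` keeps it as a separate group.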