Pandas merging value of two rows in columns of a single row

I have data like this, it’s output of a groupby:

numUsers = df.groupby(["user","isvalid"]).count()

                      count     
user       isvalid               
5          0.0         1336  
           1.0          387 

But I need to have count of count_valid and count_invalid columns for each user, like this:

                    count_valid  count_invalid
user 
5                           387           1336
           

How can I do it in optimized way in Pandas?

>Solution :

You can use:

out = (df.groupby(["user","isvalid"]).count()
         .rename({0: 'count_invalid', 1: 'count_valid'}, level=1)
         ['count'].unstack()
       )

Output:

isvalid  count_invalid  count_valid
user                               
5                 1336          387

Or, more generic if you have multiple columns, using a MultiIndex:

out = (df.groupby(["user","isvalid"]).count()
         .unstack().rename(columns={0: 'invalid', 1: 'valid'}, level=1)
       )
out.columns = out.columns.map('_'.join)

Output:

      count_invalid  count_valid
user                            
5              1336          387

Or from the original dataset with a crosstab:

pd.crosstab(df['user'], df['isvalid'].map({0: 'count_invalid', 1: 'count_valid'}))

Leave a Reply