Filtering data-frame columns using regex, then using .groupby to calculate sum

December 8, 2022

I have a dataframe which I want to group, filter columns by regex, and then sum.

My code looks like this:

import pandas as pd

df = pd.DataFrame({'ID':[1,1,2,2,3,3], 
                   'Invasive' : [12,1,1,0,1,0], 
                   'invasive': [1,4,5,3,4,6],
                   'Wild':[4,7,1,0,0,0],
                   'wild':[0,0,9,8,3,2],
                   'Crop':[0,0,0,0,0,0],
                   'Crop_2':[2,3,2,2,1,2]})

df.groupby(['ID']).filter(regex='(Invasive)|(invasive)|(Wild)|(wild)').sum()

The error message I get is:

DataFrameGroupBy.filter() missing 1 required positional argument: 'func'

I get the same error message if groupby comes after filter.

Why does this happen? Where do I input the func argument?

EDIT:

My expected output is a single column that holds the sum across the filtered columns, grouped by ID. E.g.:

   ID  Output
0   1      29
1   2      27
2   3      16

Solution:

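What you tried cannot work as written: groupby.filter (DataFrameGroupBy.filter) keeps or drops rows, one whole group at a time, based on a function that you pass as func. It should not be confused with DataFrame.filter, which selects columns by label and is the method that accepts regex.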
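To make the distinction concrete, here is a minimal sketch. The small frame and the threshold of 10 are made up for illustration and are not part of the question:

```python
import pandas as pd

# Hypothetical toy frame, smaller than the one in the question
df = pd.DataFrame({'ID': [1, 1, 2, 2],
                   'Wild': [4, 7, 1, 0],
                   'wild': [0, 0, 9, 8]})

# DataFrameGroupBy.filter: func receives each group's sub-frame and
# returns a bool; all rows of a group are kept or dropped together.
rows_kept = df.groupby('ID').filter(lambda g: g['Wild'].sum() > 10)

# DataFrame.filter: selects the columns whose label matches the regex.
cols_kept = df.filter(regex='(?i)wild')

print(rows_kept['ID'].tolist())  # [1, 1]
print(list(cols_kept.columns))   # ['Wild', 'wild']
```

The first call answers "which groups survive?", the second answers "which columns survive?". Only the second one takes a regex.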

You likely want to filter the columns, then to aggregate:

df.filter(regex='(?i)(Invasive|Wild)').groupby(df['ID']).sum()

NB. I replaced (Invasive)|(invasive)|(Wild)|(wild) with (?i)(Invasive|Wild), which means 'Invasive OR Wild', regardless of case.
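If you want to check a pattern before applying it to the dataframe, you can run it against the column names with the standard re module. This is only a sketch of the matching step: DataFrame.filter(regex=...) keeps the labels for which re.search finds a match.

```python
import re

# Column names from the question's dataframe
columns = ['ID', 'Invasive', 'invasive', 'Wild', 'wild', 'Crop', 'Crop_2']

# (?i) at the start switches on case-insensitive matching for the whole pattern
pattern = re.compile(r'(?i)(Invasive|Wild)')

# Keep every label in which the pattern is found anywhere
matched = [c for c in columns if pattern.search(c)]
print(matched)  # ['Invasive', 'invasive', 'Wild', 'wild']
```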

Output:

    Invasive  invasive  Wild  wild
ID                                
1         13         5    11     0
2          1         8     1    17
3          1        10     0     5
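Note that the grouping key is passed as the Series df['ID'] rather than the label 'ID'. After the column filter, the ID column is no longer in the frame, so grouping by label would fail; the Series still works because pandas aligns it on the index. A short sketch on a reduced version of the question's data (two of the value columns only, to keep it brief):

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'Invasive': [12, 1, 1, 0, 1, 0],
                   'wild': [0, 0, 9, 8, 3, 2]})

filtered = df.filter(regex='(?i)(Invasive|Wild)')
print('ID' in filtered.columns)  # False

# Grouping by the label raises KeyError because the column is gone
try:
    filtered.groupby('ID').sum()
    label_works = True
except KeyError:
    label_works = False

# Grouping by the original Series works: it shares the frame's index
by_series = filtered.groupby(df['ID']).sum()
```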

Edit:

The output you show needs one further summation, across each row:

out = (df.filter(regex='(?i)(Invasive|Wild)')
         .groupby(df['ID']).sum()
         .sum(axis=1)
         .reset_index(name='Output')
      )

# or sum across each row first, then group:
out = (df.filter(regex='(?i)(Invasive|Wild)')
         .sum(axis=1)
         .groupby(df['ID']).sum()
         .reset_index(name='Output')
      )

Output:

   ID  Output
0   1      29
1   2      27
2   3      16
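Both orderings give the same totals because the result is a plain sum of the same cells either way. A quick check on the question's data, as a sketch (the variable names are my own):

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2, 3, 3],
                   'Invasive': [12, 1, 1, 0, 1, 0],
                   'invasive': [1, 4, 5, 3, 4, 6],
                   'Wild': [4, 7, 1, 0, 0, 0],
                   'wild': [0, 0, 9, 8, 3, 2],
                   'Crop': [0, 0, 0, 0, 0, 0],
                   'Crop_2': [2, 3, 2, 2, 1, 2]})

cols = df.filter(regex='(?i)(Invasive|Wild)')

# Sum per group first, then across the columns
group_then_row = (cols.groupby(df['ID']).sum()
                      .sum(axis=1)
                      .reset_index(name='Output'))

# Sum across the columns first, then per group
row_then_group = (cols.sum(axis=1)
                      .groupby(df['ID']).sum()
                      .reset_index(name='Output'))

print(group_then_row['Output'].tolist())  # [29, 27, 16]
print(row_then_group['Output'].tolist())  # [29, 27, 16]
```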

pandas

by MR

Published December 08, 2022
