Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Groupby & Sum – Create new column with added If Condition

I have the below DataFrame:

ID  Start           End              Variance
1   100000          120000           20000
1   1               0                -1
1   7815.58         7815.58          0
1   5261            5261             0
1   138783.2        89969.37         -48813.83
1   2459.92         2459.92          0
2   101421.99       93387.45         -8034.54
2   940.04          940.04           0
2   63.06           63.06            0
2   2454.86         2454.86          0
2   830             830              0
2   299             299              0
2   14000           12000            2000
2   1500            500              1000


I want to create a new column, Overspend Total. But I only want to sum the values that are greater than 0. The resulting DataFrame will look like this:

ID  Start           End              Variance        Overspend Total
1   100000          120000           20000           20000
1   1               0                -1              20000
1   7815.58         7815.58          0               20000
1   5261            5261             0               20000
1   138783.2        89969.37         -48813.83       20000
1   2459.92         2459.92          0               20000
2   101421.99       93387.45         -8034.54        3000
2   940.04          940.04           0               3000
2   63.06           63.06            0               3000
2   2454.86         2454.86          0               3000
2   830             830              0               3000
2   299             299              0               3000
2   14000           12000            2000            3000
2   1500            500              1000            3000

I tried the following

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df['Overspend Variance'] = df[df['Variance'] > 0].groupby(df['ID']).transform('sum')

But I’m getting the below error:

ValueError: Wrong number of items passed 8, placement implies 1

I know df['Overspend Variance'] = df['Variance'].groupby(df['ID']).transform('sum') would work without a condition, but I can’t figure out how to incorporate it with an extra condition.

>Solution :

We can use Series.where to replace the values that don’t match the condition with NaN, then just groupby transform ‘sum’ since NaN values are ignored by ‘sum’ by default:

df['Overspend Total'] = (
    df['Variance'].where(df['Variance'] > 0).groupby(df['ID']).transform('sum')
)

Or explicitly replace with the additive identity (0) which will not affect the sum:

df['Overspend Total'] = (
    df['Variance'].where(df['Variance'] > 0, 0)
        .groupby(df['ID']).transform('sum')
)

Or with a lambda inside groupby transform:

df['Overspend Total'] = df.groupby('ID')['Variance'].transform(
    lambda s: s[s > 0].sum()
)

In any case df is:

    ID      Start        End  Variance  Overspend Total
0    1  100000.00  120000.00  20000.00          20000.0
1    1       1.00       0.00     -1.00          20000.0
2    1    7815.58    7815.58      0.00          20000.0
3    1    5261.00    5261.00      0.00          20000.0
4    1  138783.20   89969.37 -48813.83          20000.0
5    1    2459.92    2459.92      0.00          20000.0
6    2  101421.99   93387.45  -8034.54           3000.0
7    2     940.04     940.04      0.00           3000.0
8    2      63.06      63.06      0.00           3000.0
9    2    2454.86    2454.86      0.00           3000.0
10   2     830.00     830.00      0.00           3000.0
11   2     299.00     299.00      0.00           3000.0
12   2   14000.00   12000.00   2000.00           3000.0
13   2    1500.00     500.00   1000.00           3000.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading