Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to calculate the total sum except one row in a dataframe?

I want to calculate the total sum of all parameter in a dataframe except one named Production-avg in the below python code:

code:

import pandas as pd

# Example DataFrame
df = pd.DataFrame({'category': ['Production', 'Production', 'Production', 'Production', 'Production'], 'parameter': ['hello', 'Production-avg', 'Production-etf', 'ktf', 'kfc'], 'actuals': [10, 30, 50, 70, 90], 'bp': [15, 35, 55, 75, 95]})
df.loc['total'] = df.select_dtypes(include=['int64']).sum()
print(df)

current output:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

         category       parameter  actuals     bp
0      Production           hello     10.0   15.0
1      Production  Production-avg     30.0   35.0
2      Production  Production-etf     50.0   55.0
3      Production             ktf     70.0   75.0
4      Production             kfc     90.0   95.0
total         NaN             NaN    250.0  275.0

Expected output:

           category       parameter  actuals     bp
 0      Production           hello     10.0   15.0
 1      Production  Production-avg     30.0   35.0
 2      Production  Production-etf     50.0   55.0
 3      Production             ktf     70.0   75.0
 4      Production             kfc     90.0   95.0
 total         NaN             NaN    220.0  240.0

Can anyone suggest a solution to get the expected output?

>Solution :

You can use boolean indexing to filter out rows where parameter column value is Production-avg

df.loc['total'] = (df.select_dtypes('number').drop('total')
                   [df['parameter'] != 'Production-avg'].sum())
# or
df.loc['total'] = (df.query('parameter != "Production-avg"')
                   .select_dtypes('number').drop('total').sum())

As mozway suggests, some modification in your original code:

  1. Use .select_dtypes('number') to select numeric columns to include int and float type.
  2. Use .drop('total') to drop the existing total row to avoid summing that row to new result.
print(df)

         category       parameter  actuals     bp
0      Production           hello     10.0   15.0
1      Production  Production-avg     30.0   35.0
2      Production  Production-etf     50.0   55.0
3      Production             ktf     70.0   75.0
4      Production             kfc     90.0   95.0
total         NaN             NaN    220.0  240.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading