Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I sum all values in one column based on values from range of another column

Combi Tons width bruttotime time time in minutes total time for Combi
300244871 23.080 1.544,00 69,98 05:12:30 70,05
0,000 1.544,00 0,00 06:22:33 0,00
22.790 1.544,00 71,98 06:22:33 71,98
300244871 0,000 1.544,00 0,00 07:34:32 0,00
300244881 23.080 1.544,00 69,98 05:12:30 77,05
0,000 1.544,00 0,00 06:22:33 0,00
22.790 1.544,00 71,98 06:22:33 5,98
300244881 0,000 1.544,00 0,00 07:34:325,98 0,00

How do I sum all values in last column (time in minutes) based on values from the Combi column (from 300244871 – to 300244871) Total time for Combi should be 142,03 for 300244871.

I have tried with below code.

df['time in minutes'] = pd.to_numeric(df['time in minutes'].str.replace(',', '.'))
start_value = df['Combi'].min()
end_value = df['Combi'].max()
filtered_df = df[(df['Combi'] >= start_value) & (df['Combi'] <= end_value)]
sum_result = filtered_df['time in minutes'].sum()
print(f"Sum of values for {start_value} to {end_value} in 'time in minutes': {sum_result}")

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You can use a custom groupby.agg. Assuming NaNs in the empty cell, create the grouper with ffill:

df['time in minutes'] = pd.to_numeric(df['time in minutes'].str.replace(',', '.'))
out = (df.groupby(df['Combi'].ffill())
         .agg(**{'start': ('time', 'min'),
                 'end': ('time', 'max'),
                 'sum': ('time in minutes', 'sum')
                })
      )

NB. If the empty cells are empty strings, use df['Combi'].replace('', np.nan).ffill() as grouper.

Output:

                start       end     sum
Combi                                     
300244871.0  05:12:30  07:34:32  142.03
300244881.0  05:12:30  07:34:32   83.03

Using a loop:

for k, g in df.groupby(df['Combi'].ffill()):
    start_value = g['time'].min()
    end_value = g['time'].max()
    sum_result = g['time in minutes'].sum()
    print(f"Sum of values for {start_value} to {end_value} in 'time in minutes': {sum_result}")

Output:

Sum of values for 05:12:30 to 07:34:32 in 'time in minutes': 142.03
Sum of values for 05:12:30 to 07:34:32 in 'time in minutes': 83.03
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading