Python: sum column for every dataframe in a list

I have a list of identical dataframes and I am trying to sum one column in each dataframe in the list. My thought is something like total = [df['A'].sum for df in dfs], but this returns a list of the same length as dfs that contains only references to the sum method rather than values. My desired output is a list of the column sums, one per dataframe. What is the fastest way to achieve this? I have to repeat this sum thousands of times per list, on thousands of different lists.

>Solution :

You are probably missing the () after sum:

 total = [df['A'].sum() for df in dfs]

You want to call the method sum, not just reference it.
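To make the difference concrete, here is a minimal sketch. The two small dataframes and the names refs and total are made up for illustration only:

```python
import pandas as pd

# Two small example dataframes with the same columns (made-up data).
dfs = [
    pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]}),
    pd.DataFrame({"A": [4, 5, 6], "B": [40, 50, 60]}),
]

# Without parentheses: a list of bound methods, nothing is summed yet.
refs = [df["A"].sum for df in dfs]

# With parentheses: the method is called and each column sum is returned.
total = [df["A"].sum() for df in dfs]
```

Here refs holds two callable method objects, while total holds the sums 6 and 15.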

Python's built-in sum is pretty quick (see "Python built-in sum function vs. for loop performance"), and I assume that pandas sum should be comparable (see "Difference between sum, 'sum' and np.sum *under the hood* (Python / Pandas / Numpy)").
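If the per-dataframe calls turn out to be the bottleneck, one option worth timing is to stack the columns into a single NumPy array and sum once. This is only a sketch, and it assumes every dataframe in the list has the same number of rows; the example data and the names stacked and totals are hypothetical. Whether it is actually faster depends on the size and number of your dataframes, so measure it on your own data (for example with timeit).

```python
import numpy as np
import pandas as pd

# Made-up example data; in practice dfs is your existing list.
dfs = [
    pd.DataFrame({"A": [1.0, 2.0, 3.0]}),
    pd.DataFrame({"A": [4.0, 5.0, 6.0]}),
]

# Assumption: all dataframes have the same number of rows, so the
# "A" columns can be stacked into one 2-D array (one row per dataframe).
stacked = np.stack([df["A"].to_numpy() for df in dfs])

# A single vectorized sum along each row gives one total per dataframe.
totals = stacked.sum(axis=1)
```

Note that pandas sum skips NaN values by default while NumPy's sum does not, so the two approaches can differ if column A contains missing values.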
