Sum values in a list based on the unique values in another list

July 3, 2023

I have two lists: one contains strings ("A", "B" or "C") and the other contains numeric values. The goal is to sum the values for every unique string. I assume the list of strings is grouped (the order of the groups is arbitrary) and that the indexes of the two lists match up.

Example:

list_one = ["A", "A", "B", "B", "C", "C"]
list_two = [1000, 200, 500, 120, 500, 350]

The resulting list should be the sum for each unique string in list_one based on the values in list_two:

res_list = [1200, 620, 850]

I can find the index of the first occurrence of each unique string in list_one with

np.unique(list_one, return_index=True)[1]  # array([0, 2, 4], dtype=int64)

but I don’t know how to proceed from here.

>Solution :

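Assuming that both lists are of the same length, you could use a dictionary to accumulate the sum for each string. Because dictionaries preserve insertion order (Python 3.7+), the sums come out in the order in which each string first appears: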

list_one = ["A", "A", "B", "B", "C", "C"]
list_two = [1000, 200, 500, 120, 500, 350]

r = {}

for a, b in zip(list_one, list_two):
    r[a] = r.get(a, 0) + b

print(list(r.values()))

Output:

[1200, 620, 850]

You could also use a defaultdict for slightly more concise code as follows:

from collections import defaultdict

list_one = ["A", "A", "B", "B", "C", "C"]
list_two = [1000, 200, 500, 120, 500, 350]

r = defaultdict(int)

for a, b in zip(list_one, list_two):
    r[a] += b

print(list(r.values()))
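
If you would rather continue from the np.unique attempt in the question, one possible sketch (not the only way) is to ask np.unique for the inverse indices instead of the first-occurrence indices and pass them to np.bincount with the values as weights. This assumes NumPy is installed and that list_one is one-dimensional:

```python
import numpy as np

list_one = ["A", "A", "B", "B", "C", "C"]
list_two = [1000, 200, 500, 120, 500, 350]

# keys: the sorted unique strings
# inverse: for every element of list_one, its position in keys
keys, inverse = np.unique(list_one, return_inverse=True)

# bincount with weights adds up list_two per group; the result is a float array
sums = np.bincount(inverse, weights=list_two)

print(keys.tolist())              # ['A', 'B', 'C']
print(sums.astype(int).tolist())  # [1200, 620, 850]
```

Unlike the dictionary versions, the result here follows the sorted order of the unique strings rather than their order of first appearance, and it does not require equal strings to be adjacent in list_one.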