Check for matching columns and sum values over

March 17, 2023

I have 2 dataframes as such:

DF1:

Name	Age	Score
A	20	55
B	21	60
C	22	20
D	23	25
E	24	30

The second dataframe will contain similar data but with less info. Crucially, all the names in the the second dataframe are in the first one and I am taking a big assumption that all names in df1 are unique.

DF2:

Name	Age	Score
A	20	20
D	23	40
E	24	80

What I would like to do is go over the second dataframe, find the name that matches in the first dataframe and sum the scores up such that I would end up with the following dataframe:

Name	Age	Score
A	20	75
B	21	60
C	22	20
D	23	65
E	24	110

>Solution :

Use concat with aggregation – here if all another columns are summed:

out = pd.concat([DF1, DF2]).groupby(['Name','Age'], as_index=False).sum()
print (out)
  Name  Age  Score
0    A   20     75
1    B   21     60
2    C   22     20
3    D   23     65
4    E   24    110

If need specify multiple aggregation functions – here first value per groups by Class and another columns aggregate sum add GroupBy.agg:

d = {'Class':'first', 'Score':'sum', 'Att.':'sum', 'Test':'sum'}
out = pd.concat([DF1, DF2]).groupby(['Name','Age'], as_index=False).agg(d)
print (out)

  Name  Age Class  Score  Att.  Test
0    A   20   ENG     60    35    55
1    B   19   ENG     45    15    60