I have 2 dataframes as such:
DF1:
Name | Age | Score |
---|---|---|
A | 20 | 55 |
B | 21 | 60 |
C | 22 | 20 |
D | 23 | 25 |
E | 24 | 30 |
The second dataframe will contain similar data but with less info. Crucially, all the names in the the second dataframe are in the first one and I am taking a big assumption that all names in df1 are unique.
DF2:
Name | Age | Score |
---|---|---|
A | 20 | 20 |
D | 23 | 40 |
E | 24 | 80 |
What I would like to do is go over the second dataframe, find the name that matches in the first dataframe and sum the scores up such that I would end up with the following dataframe:
Name | Age | Score |
---|---|---|
A | 20 | 75 |
B | 21 | 60 |
C | 22 | 20 |
D | 23 | 65 |
E | 24 | 110 |
>Solution :
Use concat
with aggregation – here if all another columns are summed:
out = pd.concat([DF1, DF2]).groupby(['Name','Age'], as_index=False).sum()
print (out)
Name Age Score
0 A 20 75
1 B 21 60
2 C 22 20
3 D 23 65
4 E 24 110
If need specify multiple aggregation functions – here first value per groups by Class
and another columns aggregate sum
add GroupBy.agg
:
d = {'Class':'first', 'Score':'sum', 'Att.':'sum', 'Test':'sum'}
out = pd.concat([DF1, DF2]).groupby(['Name','Age'], as_index=False).agg(d)
print (out)
Name Age Class Score Att. Test
0 A 20 ENG 60 35 55
1 B 19 ENG 45 15 60