I have 2 dataframes as such:
DF1:
| Name | Age | Score |
|---|---|---|
| A | 20 | 55 |
| B | 21 | 60 |
| C | 22 | 20 |
| D | 23 | 25 |
| E | 24 | 30 |
The second dataframe will contain similar data but with less info. Crucially, all the names in the the second dataframe are in the first one and I am taking a big assumption that all names in df1 are unique.
DF2:
| Name | Age | Score |
|---|---|---|
| A | 20 | 20 |
| D | 23 | 40 |
| E | 24 | 80 |
What I would like to do is go over the second dataframe, find the name that matches in the first dataframe and sum the scores up such that I would end up with the following dataframe:
| Name | Age | Score |
|---|---|---|
| A | 20 | 75 |
| B | 21 | 60 |
| C | 22 | 20 |
| D | 23 | 65 |
| E | 24 | 110 |
>Solution :
Use concat with aggregation – here if all another columns are summed:
out = pd.concat([DF1, DF2]).groupby(['Name','Age'], as_index=False).sum()
print (out)
Name Age Score
0 A 20 75
1 B 21 60
2 C 22 20
3 D 23 65
4 E 24 110
If need specify multiple aggregation functions – here first value per groups by Class and another columns aggregate sum add GroupBy.agg:
d = {'Class':'first', 'Score':'sum', 'Att.':'sum', 'Test':'sum'}
out = pd.concat([DF1, DF2]).groupby(['Name','Age'], as_index=False).agg(d)
print (out)
Name Age Class Score Att. Test
0 A 20 ENG 60 35 55
1 B 19 ENG 45 15 60