I have 2 data frames I want to match some data from one data frame and append it on another.
df1 looks like this:
| sourceId | firstName | lastName |
|---|---|---|
| 1234 | John | Doe |
| 5678 | Sally | Green |
| 9101 | Chlodovech | Anderson |
df2 looks like this:
| sourceId | agentId |
|---|---|
| 123456789 | 1234,5678 |
| 987654321 | 9101 |
| 143216546 | 1234,5678 |
I want my Final Data Frame to look like this:
| sourceId | firstName | lastName | agentId |
|---|---|---|---|
| 1234 | John | Doe | 123456789,143216546 |
| 5678 | Sally | Green | 123456789,143216546 |
| 9101 | Chlodovech | Anderson | 987654321 |
Usually appending stuff is easy but I’m not quite sure how to match this data up, and then append the matches with commas in-between them. I’m fairly new to using pandas so any help is appreciated.
>Solution :
This works. It’s long and not the most elegant, but it works well 🙂
tmp = df2.assign(agentId=df2['agentId'].str.split(',')).explode('agentId').set_index('agentId')['sourceId'].astype(str).groupby(level=0).agg(list).str.join(',').reset_index()
df1['sourceId'] = df1['sourceId'].astype(str)
new_df = df1.merge(tmp, left_on='sourceId', right_on='agentId').drop('agentId',axis=1).rename({'sourceId_x':'sourceId', 'sourceId_y':'agentId'},axis=1)
Output:
>>> new_df
sourceId firstName lastName agentId
0 1234 John Doe 123456789,143216546
1 5678 Sally Green 123456789,143216546
2 9101 Chlodovech Anderson 987654321