I have two data frames.
DF1
isActive,trackedSearchId
True,53436615
True,53434228
True,53434229
DF2
trackedSearchIds,Primary Keyword Group(s)
"[53436613, 53436615, 53437436, 53436506]",SEO - Directory-Deployment
"[53435887, 53437509, 53437441, 53436615, 53438685, 53437636]",SEO - Other-Glossary
"[53437504, 53435090, 53435887, 53434228]",SEO - Other
"[53437504, 53435090, 53434229]",SEO - Glossary
I want to check for each row of DF1 for column trackedSearchId and check with each row in DF2 with each value of trackedSearchIds array and if the value of DF1 column is present in DF2 append it with DF1.
The output should look like:
isActive,trackedSearchId,Primary Keyword Group(s)
True,53436615,SEO - Directory-Deployment&SEO - Other-Glossary
True,53434228,SEO - Other
True,53434229,SEO - Glossary
>Solution :
Assuming strings representations of lists in df2, use ast.literal_eval, then groupby.agg, explode, and merge:
import ast
df1['trackedSearchId'] = df1['trackedSearchId'].astype(str)
out = df1.merge((df2.assign(trackedSearchIds=df2['trackedSearchIds'].str.findall(r'(\d+)')).explode('trackedSearchIds')
.groupby('trackedSearchIds').agg('&'.join)
),
left_on='trackedSearchId', right_on='trackedSearchIds')
Output:
isActive trackedSearchId Primary Keyword Group(s)
0 True 53436615 SEO - Directory-Deployment&SEO - Other-Glossary
1 True 53434228 SEO - Other
2 True 53434229 SEO - Glossary