So I have two tables I’d like to merge. However, the column in the second table has references to two columns of the first table, e.g.:
Table 1 python:
table1 = pd.DataFrame({
'Key1': ['Value1', 'Value2', 'Value3'],
'Key2': ['Apples', 'Pears', 'Oranges'],
'Text': ['Hello', 'World', '!'],
'Values': [5, 8, 12],
})
Table 1:
ββββββββββ¦ββββββββββ¦ββββββββ¦βββββββββ
β Key1 β Key2 β Text β Values β
β βββββββββ¬ββββββββββ¬ββββββββ¬βββββββββ£
β Value1 β Apples β Hello β 5 β
β Value2 β Pears β World β 8 β
β Value3 β Oranges β ! β 12 β
ββββββββββ©ββββββββββ©ββββββββ©βββββββββ
Table 2 python:
table2 = pd.DataFrame({'Key1_Key2': ['Value2', 'Apples', 'Oranges'],
'Value_to_Extract': ['Value 123', 'Value 88',
'Value 99']})
Table 2:
βββββββββββββ¦βββββββββββββββββββ
β Key1_Key2 β Value_to_Extract β
β ββββββββββββ¬βββββββββββββββββββ£
β Value2 β Value 123 β
β Apples β Value 88 β
β Oranges β Value 99 β
βββββββββββββ©βββββββββββββββββββ
I first would like to merge based on Key1 values, and only if that fails, then on Key2 values. It should be an exact match, so I am not looking to concatenate Key1 and Key2 and then search for a substring because then if there are values that look alike (e.g. ‘Apples’ and ‘Apple’) then it will wreak havoc.
Result should look like this:
ββββββββββ¦ββββββββββ¦ββββββββ¦βββββββββ¦βββββββββββββββββββ
β Key1 β Key2 β Text β Values β Value_To_Extract β
β βββββββββ¬ββββββββββ¬ββββββββ¬βββββββββ¬βββββββββββββββββββ£
β Value1 β Apples β Hello β 5 β Value 88 β
β Value2 β Pears β World β 8 β Value 123 β
β Value3 β Oranges β ! β 12 β Value 99 β
ββββββββββ©ββββββββββ©ββββββββ©βββββββββ©βββββββββββββββββββ
>Solution :
You should perform the two merge independently, then combine_first in order of preference to fill the missing values only where the first merge failed:
m1 = table1.merge(table2, left_on='Key1', right_on='Key1_Key2', how='left')
m2 = table1.merge(table2, left_on='Key2', right_on='Key1_Key2', how='left')
# uncomment to remove merge column
m1.combine_first(m2)#.drop(columns='Key1_Key2')
output:
Key1 Key2 Text Values Key1_Key2 Value_to_Extract
0 Value1 Apples Hello 5 Apples Value 88
1 Value2 Pears World 8 Value2 Value 123
2 Value3 Oranges ! 12 Oranges Value 99`