let’s say I have a ‘base’ DF with encoded columns and many mapping tables – each of them includes a dictionary for one encoded column from the base df, for example
Base df: df
RETAILER STATE_CODE SIZE_CODE POTENTIAL_CODE
store1 1 008 222
store2 2 001 111
store3 3 006 333
Mapping table 1: mp_df1
STATE_CODE STATE_NAME
1 California
2 Nevada
3 Washington
Mapping table 2: mp_df2
SIZE_CODE SIZE_NAME
001 Small
006 Moderate
008 Big
Mapping table 3: mp_df3
POTENTIAL_CODE POTENTIAL_NAME
111 Small
222 Moderate
333 Promising
So the base table has a seperate key for each of the mapping tables.
In this case I can use 3 merges but in real life I have more than just 3 mapping tables so it’s really annoying to do it ‘manually’. Is there a simpler way to combine these dfs into one?
RETAILER STATE_CODE SIZE_CODE POTENTIAL_CODE STATE_NAME SIZE_NAME POTENTIAL_NAME
store1 1 008 222 California Big Moderate
store2 2 001 111 Nevada Small Small
store3 3 006 333 Washington Moderate Promising
>Solution :
You can use pd.concat and map:
out = pd.concat([df,
df['STATE_CODE'].map(mp_df1.set_index('STATE_CODE').squeeze()),
df['SIZE_CODE'].map(mp_df2.set_index('SIZE_CODE').squeeze()),
df['POTENTIAL_CODE'].map(mp_df3.set_index('POTENTIAL_CODE').squeeze())], axis=1)
print(out)
# Output:
RETAILER STATE_CODE SIZE_CODE POTENTIAL_CODE STATE_CODE SIZE_CODE \
0 store1 1 008 222 California Big
1 store2 2 001 111 Nevada Small
2 store3 3 006 333 Washington Moderate
POTENTIAL_CODE
0 Moderate
1 Small
2 Promising