Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas: How to easily merge a dataframe with many other dataframes on different keys?

let’s say I have a ‘base’ DF with encoded columns and many mapping tables – each of them includes a dictionary for one encoded column from the base df, for example

Base df: df

RETAILER    STATE_CODE  SIZE_CODE   POTENTIAL_CODE
store1      1           008         222
store2      2           001         111
store3      3           006         333

Mapping table 1: mp_df1

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

STATE_CODE STATE_NAME
1          California
2          Nevada
3          Washington

Mapping table 2: mp_df2

SIZE_CODE SIZE_NAME
001       Small
006       Moderate
008       Big

Mapping table 3: mp_df3

POTENTIAL_CODE POTENTIAL_NAME
111            Small
222            Moderate
333            Promising

So the base table has a seperate key for each of the mapping tables.

In this case I can use 3 merges but in real life I have more than just 3 mapping tables so it’s really annoying to do it ‘manually’. Is there a simpler way to combine these dfs into one?

RETAILER    STATE_CODE  SIZE_CODE   POTENTIAL_CODE STATE_NAME SIZE_NAME POTENTIAL_NAME
store1      1           008         222            California Big       Moderate  
store2      2           001         111            Nevada     Small     Small
store3      3           006         333            Washington Moderate  Promising

>Solution :

You can use pd.concat and map:

out = pd.concat([df,
    df['STATE_CODE'].map(mp_df1.set_index('STATE_CODE').squeeze()),
    df['SIZE_CODE'].map(mp_df2.set_index('SIZE_CODE').squeeze()),
    df['POTENTIAL_CODE'].map(mp_df3.set_index('POTENTIAL_CODE').squeeze())], axis=1)
print(out)

# Output:
  RETAILER STATE_CODE SIZE_CODE POTENTIAL_CODE  STATE_CODE SIZE_CODE  \
0   store1          1       008            222  California       Big   
1   store2          2       001            111      Nevada     Small   
2   store3          3       006            333  Washington  Moderate   

  POTENTIAL_CODE  
0       Moderate  
1          Small  
2      Promising  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading