Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Choose values from a dataframe based on values in a second data frame without looping

The following minimal example produces the desired output, but I really want to do this without looping through the dataframes, since in reality I work with much larger dataframes an this takes quite a lot of time.

import pandas as pd

data1 = {'SCH_NUMMER': [2, 9, 8, 5], 'h': [9, 8, 7, 6]}
df1 = pd.DataFrame(data=data1)

data2 = {'HAL_HALTNU': [1, 2, 3, 4], 'HAL_VONSCH': [9, 5, 8, 2]}
df2 = pd.DataFrame(data=data2)

for n in df2['HAL_HALTNU']:
    vonsch = df2.loc[df2['HAL_HALTNU'] == n, 'HAL_VONSCH'].iloc[0]
    h1 = df1.loc[df1['SCH_NUMMER'] == vonsch, 'h'].iloc[0]
    print(f"HAL_HALTNU: {n}, h1 = {h1}")

I have two dataframes, in which two columns, namely df1['SCH_NUMMER'] and df2['HAL_VONSCH'], contain the same values, but not in the same order.
I now want to assign to every number in df1['HAL_HALTNU'] (every row in the dataframe) the corresponding value from df1['h'] in a way that for example the value h = 8, corresponding to df1['SCH_NUMMER'] == 9, is assigned to the row where df2['HAL_VONSCH'] == 9 i.e. to df2['HAL_HALTNU'] == 1. So I need the values from the colums ‘SCH_NUMMER’ and ‘HAL_VONSCH’ to ‘match’ in a way.

Here is the desired output

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

HAL_HALTNU: 1, h1 = 8
HAL_HALTNU: 2, h1 = 6
HAL_HALTNU: 3, h1 = 7
HAL_HALTNU: 4, h1 = 9

>Solution :

Use merge:

df2['h1'] = df2.merge(df1, left_on='HAL_VONSCH', right_on='SCH_NUMMER', how='left')['h']
print(df2)

# Output
   HAL_HALTNU  HAL_VONSCH  h1
0           1           9   8
1           2           5   6
2           3           8   7
3           4           2   9

Or map:

df2['h1'] = df2['HAL_VONSCH'].map(df1.set_index('SCH_NUMMER')['h'])
print(df2)

# Output
   HAL_HALTNU  HAL_VONSCH  h1
0           1           9   8
1           2           5   6
2           3           8   7
3           4           2   9
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading