Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

merge two dataframes based on condition and get id

I have two dataframes:

df1

name    year

name_1  2018
name_2  2017
name_3  2021
name_1  2022
.
.
.

and:

df2

name    start    end    id

name_1  2017     2021   1
name_2  2017     2022   2
name_1  2022     2023   3

Now I want to add the id of each name into df1 from df2 based on the following two conditions:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  • they have the same name value,
  • the year: for example the first row in df1 year is 2018 so if we go to df2, we have two name_1, and 2018 falls in the first row of df2 (2017 <= 2018 < 2021); so we return id 1 in that case

How to do that?

>Solution :

You can use an outer merge:

out = (df1.merge(df2, on='name', how='outer')
          .query('(start <= year) & (year <= end)')
          .astype(df1.dtypes.to_dict() | df2.dtypes.to_dict()))
print(out)

# Output
     name  year  start   end  id
0  name_1  2018   2017  2021   1
4  name_2  2017   2017  2022   2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading