Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Retain original number of rows with dataset to be matched, when pairing values from two different datasets in Pandas

Data

df1

ID                  stat
AA1                 exzone
BB2                 exzone5
CC4                 limit5

df2

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

name                state
AA1                 NY
AA1                 NY
AA1                 NY
AA1                 NY
BB2                 GA
BB2                 GA
BB2                 GA
CC4                 CA
CC4                 CA

Desired

name                stat          state
AA1                 exzone        NY
BB2                 exzone5       GA
CC4                 limit5        CA

Doing

out = pd.merge(df1,df2, left_on=['ID'], right_on= ['name'], how="left")

however, the above script is giving an exploded output and does not retain the original Left dataframe row count. Any suggestion is appreciated.

>Solution :

A left merge doesn’t mean that the structure will be identical to that of the original left DataFrame. It means that all the left keys will be preserved, even if absent from the right DataFrame. In your case the duplicated keys on the right force the merge to compute all combinations of the rows.

You need to first remove the duplicates:

out = pd.merge(df1, df2.drop_duplicates(), left_on=['ID'], right_on= ['name'], how="left")

If for some reason you have several different states per name, you should find another way to aggregate (pick the first, last, combine the unique states as a single string, etc.), or accept to have row duplications.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading