Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Trying to make new column of values from second dataframe based on values in two other columns

I have two dataframes, df1 has columns with month and year associated with the data and df2 has month (in numbers) as the headers and years as the index values.

I’m then trying to populate a new column in df1 with the appropriate values that correspond to the month/year from df2. I have tried .loc function but not sure if it’s meant to populate a whole column or just return one value at a time.

df1

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

other data month year
xyz 12 1966
xyz 1 1997

df2

index 1 2 3 4 5 …. 12
1929 x y z x y …. z
1930 x y z x y …. z
x y z x y …. z
1966 x y z x y …. z
1997 x y z x y …. z

and I want a new column to be added to df1 like this, based on values from df2:

other data month year df2_value
xyz 12 1966 z
xyz 1 1997 x

so far I have been trying this:

df1['df2_value'] = df2.loc[df1['year'],df2['month']]

but I’m getting this key error:

KeyError: "None of [Int64Index([12,  1,  2,  3,  2,  2,  3,  2,  4,  1,  1,  2,  3,  2,  1,  2,  2,\n
2,  2,  2, 12,  3,  1,  2, 12,  1,  2, 11,  3,  1,  2,  1,  3, 12,\n             
4,  3,  2,  1,  3,  2, 11, 12, 10, 12,  2,  4,  3,  1,  4,  1,  1,\n             
2,  3,  1,  2,  4,  2,  2,  2,  4,  2,  3, 12,  9, 12,  3,  2,  3,\n             
1,  2,  3, 11, 11,  4],\n           dtype='int64')] are in the [columns]"

I have changed the month and year columns in df1 to object type instead of integer but that didn’t change the error. This is my first time trying to use .loc so could be missing something very obvious or maybe I need to use an entirely different function?

>Solution :

Just stack df2, reset the index and merge

df1.merge(df2.stack().reset_index(),
          left_on=['year', 'month'],
          right_on=['index', 'level_1'])


  other data month  year index level_1  0
0        xyz    12  1966  1966      12  z
1        xyz     1  1997  1997       1  x
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading