Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Copy values to another dataframe from row based on condition

In my dataframe (df), I need to:

  • For all rows where the value in ‘Column A’ == Yes
  • The value in ‘Column B’ should be:
    • The value of ‘Column E’ in dataframe ‘df2’ of the row where ‘df2’ ‘Column D’ == ‘df’ ‘Column C’

Currently I have this:

df.loc[df['Column A'] == 'Yes', 'Column B'] = df2.loc[(df2['Column D'] == df['Column C']), 'Column E']

However this results in:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

ValueError: Can only compare identically-labeled Series objects

Example of my dataframes:

df:
Column A   Column B   Column C
Yes                   2554072
No                    2557766
No                    261191
Yes                   2600355
Yes                   2601388
...                   ...

df2:
Column D  Column E
2554072   8
2601388   4
2604142   6
...       

Required results:
df:
Column A   Column B   Column C
Yes        8          2554072
No                    2557766
No                    261191
Yes        NoMatch    2600355
Yes        4          2601388
...                   ...

I have tried to change the values in all of these columns to string and fill NaN values in Column C, however the ValueError is persistent. Any suggestions?

>Solution :

You can map and mask with where:

df['Column B'] = (df['Column C']
                  .map(df2.set_index('Column D')['Column E'])
                  .where(df['Column A'].eq('Yes'))
                 )

Or, with boolean indexing:

m = df['Column A'].eq('Yes')
df.loc[m, 'Column B'] =  (df.loc[m, 'Column C']
                          .map(df2.set_index('Column D')['Column E'])
                         )

Example:

# df
  Column A Column C
0      Yes        A
1       No        B
2      Yes        C
3       No        D
4      Yes        E
5       No        F

# df2
  Column D  Column E
0        D         0
1        E         1
2        F         2
3        A         3
4        B         4
5        C         5

# Output
  Column A Column C  Column B
0      Yes        A       3.0
1       No        B       NaN
2      Yes        C       5.0
3       No        D       NaN
4      Yes        E       1.0
5       No        F       NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading