Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Replace NAs based on another two data frames across multiple columns

I would like to replace NAs of df1 by df2(both of two data frames are large with about 155 columns x 3966 rows)
Both of two data frames are with the first column ‘ID’ and other column names are overlapped.
Here is a part of example:

> df1
       ID col1 col2  ...... col154
1  AMM115    C    A  ......     A+
2  ADM107   NA   NA  ......      B
3  AGM041    B    C  ......     C+
4  AGM132    A   NA  ......     A+
5  AQM007   NA    A  ......     B+
6  ARM028   NA   B+  ......     A-
7  ASM019    A   A+  ......     NA

> df2
       ID col1 col2  ...... col154
1  ADM107   A+    B  ......      B
2  AGM041    C    A  ......     B+
3  ARM028   A+   B+  ......     NA
4  AQM007   B+    A  ......     B+

I try this answers here :
Can I replace NAs when joining two data frames with dplyr?
I know coalesce might be I needed. It can fills the NA from the first vector but how could I across all columns.
but I can’t figure out how to mutate and coalesce multiple columns like my real data expect first column.
Also select suffix .x or .y at this code.

library(dplyr)
df1 %>% 
        left_join(df2, by = "fruit") %>% 
        mutate(var2 = coalesce(var2.x, var2.y)) %>% 
        select(-var2.x, -var2.y)

I want get like df3, if NAs of df1 and df2 have value , using df2 to replace NAs of df1,if not, keep df1 original values.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

> df3
       ID col1 col2  ...... col154
1  AMM115    C    A  ......     A+
2  ADM107   A+    B  ......      B
3  AGM041    B    C  ......     C+
4  AGM132    A   NA  ......     A+
5  AQM007   B+    A  ......     B+
6  ARM028   A+   B+  ......     A-
7  ASM019    A   A+  ......     NA

>Solution :

You can do:

library(dplyr)
df3 <- df1 %>%
  rows_patch(df2, by = 'ID')
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading