How to merge three dataframes to avoid NA values (R)?

I have a long dataset with responses at time 1 and time 2.

I am trying to join two other dataframes (df1 and df2) to my primary dataframe (df) to create df_join. I’ve tried the following code, but it results in four separate columns (var1.x, var2.x, var1.y, var2.y), where I would like just var1 and var2.

What is the best approach here?

library(dplyr)
df <- tibble(ID = c(1001, 1001, 1002, 1002), time = c(1, 2, 1, 2), var3 = c("324908", "1231", "23532", "09094"))
df1 <- tibble(ID = c(1001, 1002), time = c(1, 1), var1 = c(300, 400), var2 = c(300, 400))
df2 <- tibble(ID = c(1001, 1002), time = c(2, 2), var1 = c(310, 410), var2 = c(320, 420))

df_join <- df %>% 
  left_join(select(df1, ID, var1, var2), by = c("ID")) %>% 
  left_join(select(df2, ID, var1, var2), by = c("ID"))

My desired output would be:

df_join_desire <- tibble(ID = c(1001, 1001, 1002, 1002), time = c(1, 2, 1, 2), var1 = c(300, 310, 400, 410), var2 = c(300, 320, 400, 420), var3 = c("324908", "1231", "23532", "09094"))

Solution:

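The extra columns are not an error: df1 and df2 share the column names var1 and var2, so when both are joined to df by ID alone, dplyr keeps both copies and disambiguates them with the .x and .y suffixes. Because the two dataframes have the same columns, you don’t need to join them one after the other. You can bind them instead (i.e., put the rows of df2 below those of df1) and then join the result to df once, matching on both ID and time.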

df_join <- left_join(df, bind_rows(df1, df2), by = c("ID", "time"))
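
To check that this gives what you asked for, here is a self-contained sketch using the data from the question. The intermediate name lookup and the explicit by argument are my additions for illustration (without by, left_join should fall back to joining on the shared columns ID and time and print a message saying so).

```r
library(dplyr)

# Data from the question
df  <- tibble(ID = c(1001, 1001, 1002, 1002), time = c(1, 2, 1, 2),
              var3 = c("324908", "1231", "23532", "09094"))
df1 <- tibble(ID = c(1001, 1002), time = c(1, 1), var1 = c(300, 400), var2 = c(300, 400))
df2 <- tibble(ID = c(1001, 1002), time = c(2, 2), var1 = c(310, 410), var2 = c(320, 420))

# Stack df1 and df2 so that each (ID, time) pair appears exactly once
lookup <- bind_rows(df1, df2)

# One join on both keys, so var1 and var2 are added only once
df_join <- left_join(df, lookup, by = c("ID", "time"))

names(df_join)   # "ID" "time" "var3" "var1" "var2"
anyNA(df_join)   # FALSE
```

Note that var1 and var2 end up after var3. If the column order matters, something like select(df_join, ID, time, var1, var2, var3) should put them in the order of your desired output.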