Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to merge two data frmaes with no common column names using R

I have two data frames as below:

df1:

col1 <- c("Snhg6", "Mt1")
col2 <- c("Lpar6", "Nufip2")
col3 <- c("Lasp1", "Pipox")
col4 <- c("Bcam", "Rhot2")
df1 <- data.frame(col1, col2, col3, col4)

#    col1   col2  col3  col4
# 1 Snhg6  Lpar6 Lasp1  Bcam
# 2   Mt1 Nufip2 Pipox Rhot2

df2:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

col10 <- c("Bcam", "Rhot2", "Lpar6" , "Snhg6")
df2 <- data.frame(col10)

#   col10
# 1  Bcam
# 2 Rhot2
# 3 Lpar6
# 4 Snhg6

I want to merge them in a way to find how many of the items in each column of df1 is matching with the items of the df2. I would like to have an output as:

output_df:

#  col1  col2 col3  col4
# Snhg6 Lpar6   NA  Bcam
#    NA    NA   NA Rhot2

I tried to use the methods below but non of them gave the expected results.

library(dplyr)
output_df <- df1 %>% full_join(df2)

and

library(tidyr)
output_df <- crossing(df1,df2)

How can I get the desired output? Thanks.

>Solution :

  • base
res <- df1
res[array(!unlist(res) %in% df2$col10, dim(res))] <- NA
res

#    col1  col2 col3  col4
# 1 Snhg6 Lpar6 <NA>  Bcam
# 2  <NA>  <NA> <NA> Rhot2
  • dplyr
library(dplyr)

df1 %>%
  mutate(across(everything(), ~ replace(.x, !.x %in% df2$col10, NA)))

#    col1  col2 col3  col4
# 1 Snhg6 Lpar6 <NA>  Bcam
# 2  <NA>  <NA> <NA> Rhot2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading