How can I find all possible combinations of the elements in two column vectors and cross-check them against another data frame in R?

I have two data frames, each with a single column.

Data frame A:

var
A
B

and data frame B:

var
A
C

The same data in R:

library(tidyverse)
A = tibble(var = c("A","B"))
B = tibble(var = c("A","C"))

I want to build a new data frame containing all possible combinations of the elements in the two var columns. It must look like this:

varA varB
A A
A B
A C
B C

Next, I want to look up each of these combinations in another data frame, regardless of which column (varA or varB) each element appears in, and return the corresponding values from column y. The other data frame is:

varA varB y
B A 1.12
A A 1.14
C A 1.15
B C 1.16
D F 1.23
H G 1.1

In R:

D = tibble(varA = c("B","A","C","B","D","H"),
           varB = c("A","A","A","C","F","G"),
           y = c(1.12,1.14,1.15,1.16,1.23,1.10))
D

Ideally, the result should look like this:

varA varB y
A A 1.14
A B 1.12
A C 1.15
B C 1.16

How can I do this in R using the dplyr package or base functions?

>Solution :

Here is my attempt.

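library(tidyverse)

A <- tibble(var = c("A", "B"))
B <- tibble(var = c("A", "C"))

# Build every pairing of A$var with B$var (the columns are named Var1 and Var2).
# stringsAsFactors = FALSE keeps the values as character instead of factor.
combos <- expand.grid(A$var, B$var, stringsAsFactors = FALSE)

D <- tibble(varA = c("B", "A", "C", "B", "D", "H"),
            varB = c("A", "A", "A", "C", "F", "G"),
            y = c(1.12, 1.14, 1.15, 1.16, 1.23, 1.10))

# Sort the characters of each string so that "B A" and "A B" give the same key.
# The sort is character by character, so it is only safe for single-character values.
strSort <- function(x)
  sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")

# Add the order-independent key to both data frames
D <- D %>% mutate(var_new = strSort(paste(varA, varB)))
combos <- combos %>% mutate(var_new = strSort(paste(Var1, Var2)))

# Join on the key and keep the pair columns plus y
left_join(combos, D, by = "var_new") %>% select(Var1, Var2, y)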

I use expand.grid to create all possible combinations, then add a column to both data frames holding the sorted string (built by the sorting function) so that the order within a pair does not matter. After that it is a simple left join, followed by selecting the columns needed from the result.
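The sorted-string key sorts individual characters, so values longer than one character could collide (for example, "AB" paired with "C" and "A" paired with "BC" would produce the same key). A possible variant, sketched below, orders each pair with pmin() and pmax() instead and joins on the two ordered columns. The names combos, lookup, result, lo and hi are illustrative and not part of the original answer.

```r
library(dplyr)

A <- tibble(var = c("A", "B"))
B <- tibble(var = c("A", "C"))
D <- tibble(varA = c("B", "A", "C", "B", "D", "H"),
            varB = c("A", "A", "A", "C", "F", "G"),
            y    = c(1.12, 1.14, 1.15, 1.16, 1.23, 1.10))

# Every pairing of A$var with B$var, with each pair ordered alphabetically:
# lo holds the smaller value, hi the larger one.
combos <- expand.grid(Var1 = A$var, Var2 = B$var, stringsAsFactors = FALSE) %>%
  transmute(lo = pmin(Var1, Var2), hi = pmax(Var1, Var2)) %>%
  distinct()

# Order the pairs in D the same way so that "B A" and "A B" become the same row key.
lookup <- D %>%
  transmute(lo = pmin(varA, varB), hi = pmax(varA, varB), y)

# Keep every combination; y is NA where D has no matching pair.
result <- combos %>%
  left_join(lookup, by = c("lo", "hi")) %>%
  arrange(lo, hi) %>%
  rename(varA = lo, varB = hi)

result
```

With the sample data this should return the four rows requested in the question, with each pair written in alphabetical order (A A, A B, A C, B C).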

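Since the question also asks about base functions, here is a rough sketch of an order-independent lookup without dplyr. It assumes the values are character strings; order_pair is a hypothetical helper introduced only for this example, and merge() with all.x = TRUE stands in for the left join.

```r
A <- data.frame(var = c("A", "B"), stringsAsFactors = FALSE)
B <- data.frame(var = c("A", "C"), stringsAsFactors = FALSE)
D <- data.frame(varA = c("B", "A", "C", "B", "D", "H"),
                varB = c("A", "A", "A", "C", "F", "G"),
                y    = c(1.12, 1.14, 1.15, 1.16, 1.23, 1.10),
                stringsAsFactors = FALSE)

# Hypothetical helper: put the alphabetically smaller value of each pair first.
order_pair <- function(first, second) {
  data.frame(varA = pmin(first, second),
             varB = pmax(first, second),
             stringsAsFactors = FALSE)
}

# All pairings of A$var with B$var, ordered within each pair and de-duplicated.
grid   <- expand.grid(Var1 = A$var, Var2 = B$var, stringsAsFactors = FALSE)
combos <- unique(order_pair(grid$Var1, grid$Var2))

# The pairs in D, ordered the same way, with y attached.
lookup <- cbind(order_pair(D$varA, D$varB), y = D$y)

# all.x = TRUE keeps combinations that have no match in D (y becomes NA).
result <- merge(combos, lookup, by = c("varA", "varB"), all.x = TRUE)
result
```

merge() sorts on the key columns by default, so the rows come back ordered by varA and then varB.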