Find common values of two lists in R data table

I have a data table test:

id array1 array2
1 c(1, 2, 3, 4, 5) c(3, 4, 5)
2 c(6, 7, 8, 9, 10) c(6, 7, 0)
> str(test)
Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
 $ id    : num  1 2
 $ array1:List of 2
  ..$ : num  1 2 3 4 5
  ..$ : num  6 7 8 9 10
 $ array2:List of 2
  ..$ : num  3 4 5
  ..$ : num  6 7 0
 - attr(*, ".internal.selfref")=<externalptr> 

For each row, I want to find the values that are common to array1 and array2, as well as the values that are in array1 but not in array2.

I tried using setdiff() and intersect() but got incorrect results:

test[, `:=` (diff = setdiff(array1, array2),
             common = intersect(array1, array2))]
id array1 array2 diff common
1 c(1, 2, 3, 4, 5) c(3, 4, 5) c(1, 2, 3, 4, 5) NULL
2 c(6, 7, 8, 9, 10) c(6, 7, 0) c(6, 7, 8, 9, 10) NULL

Expected output:

id array1 array2 diff common
1 c(1, 2, 3, 4, 5) c(3, 4, 5) c(1, 2) c(3, 4, 5)
2 c(6, 7, 8, 9, 10) c(6, 7, 0) c(8, 9, 10) c(6, 7)

Will be grateful for any help!

Solution:

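Because array1 and array2 are list columns, setdiff() and intersect() called on them directly compare the whole columns rather than the vectors stored in each row. Use Map to loop over the corresponding list elements and apply the functions to each pair: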

library(data.table)
test[, c("diff", "common") := list(Map(setdiff, array1, array2), 
      Map(intersect, array1, array2))]

Output:

> test
      id         array1 array2     diff common
   <int>         <list> <list>   <list> <list>
1:     1      1,2,3,4,5  3,4,5      1,2  3,4,5
2:     2  6, 7, 8, 9,10  6,7,0  8, 9,10    6,7
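
To see why the original attempt returned the wrong result and what Map changes, here is a minimal base-R sketch on plain lists shaped like the two columns. The names a1 and a2 are only for illustration and are not part of the question. Called on the whole lists, setdiff() and intersect() treat each vector as a single item, whereas Map() pairs up the i-th elements.

```r
# Plain lists standing in for the array1 and array2 list columns (illustrative names)
a1 <- list(1:5, 6:10)
a2 <- list(3:5, c(6L, 7L, 0L))

# Whole-list call: each vector counts as one item, and no item of a1
# matches an item of a2, so nothing is removed and nothing is shared
whole_diff   <- setdiff(a1, a2)
whole_common <- intersect(a1, a2)
length(whole_diff)    # 2: both vectors of a1 are kept unchanged
length(whole_common)  # 0: no vector appears in both lists

# Element-wise call: Map applies the function to a1[[i]] and a2[[i]] in turn
row_diff   <- Map(setdiff, a1, a2)
row_common <- Map(intersect, a1, a2)
```

Here row_diff holds c(1, 2) and c(8, 9, 10), and row_common holds c(3, 4, 5) and c(6, 7), which matches the expected output in the question.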

Or using a single Map with transpose

test[, c("diff", "common") := transpose(Map(function(x, y) 
      list(setdiff(x, y), intersect(x, y)), array1, array2))]
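
The idea behind the single-pass variant can be sketched in base R alone: each call returns a (diff, common) pair for one row, and the pairs are then regrouped so that all the diffs form one list and all the commons form another. This sketch uses lapply for the regrouping step in place of transpose, and the names a1, a2, pairs, diff_col and common_col are hypothetical.

```r
# Plain lists standing in for the array1 and array2 list columns (illustrative names)
a1 <- list(1:5, 6:10)
a2 <- list(3:5, c(6L, 7L, 0L))

# One pass over the rows: each row yields a two-element list
pairs <- Map(function(x, y) list(diff = setdiff(x, y), common = intersect(x, y)),
             a1, a2)

# Regroup by position: one list of all diffs, one list of all commons
diff_col   <- lapply(pairs, `[[`, "diff")
common_col <- lapply(pairs, `[[`, "common")
```

Each of diff_col and common_col has one element per row, so either could be assigned as a list column.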

Or group by ‘id’ (assuming no duplicate ‘id’s) and extract ([[1]]) the first element

test[, c('diff', 'common') := .(.(setdiff(array1[[1]], 
        array2[[1]])), .(intersect(array1[[1]], array2[[1]]))), id]

Data:

test <- structure(list(id = 1:2,
                       array1 = list(1:5, 6:10),
                       array2 = list(3:5, c(6L, 7L, 0L))),
                  row.names = c(NA, -2L),
                  class = c("data.table", "data.frame"))