
data.table: find number of times unique id matches element in vector

I am looking for a data.table solution to the following problem. I have data like this:

library(data.table)

codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")

data <- data.table(
  id = c(1,1,2,3,3,4,4,4),
  code = c("A1","A3", "B1", "A2", "B2","A1","B2","C1")
)

For each unique id, I want to count how many of the vectors codes1, codes2, and codes3 contain at least one of that id's data$code values. Each vector counts at most once per id, even if several codes match it. The desired result is:

data_want <- data.table(
  id = c(1,2,3,4),
  match = c(1,1,2,3)
)


Solution:

Place the code vectors in a list. After grouping by 'id', loop over the list with lapply and check whether any element of each vector is %in% the 'code' column. Then Reduce the resulting list of logical values with `+`, which coerces TRUE to 1 and FALSE to 0 and so returns the integer count of matching vectors.

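library(data.table)
# For each id, count how many code vectors share at least one value with `code`.
# The \(x) lambda shorthand requires R >= 4.1; use function(x) on older versions.
data[, .(match = Reduce(`+`, lapply(list(codes1, codes2, codes3),
    \(x) any(x %in% code)))), by = id]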

Output:

      id match
   <num> <int>
1:     1     1
2:     2     1
3:     3     2
4:     4     3
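
The same counts can also be reached with a join instead of a loop over the list. The following is only a sketch of that alternative, not part of the answer above: the names `lookup`, `grp`, and `res` are made up for illustration. It tags every code with the index of the vector it came from, joins that tag onto the data, and counts the distinct tags per id.

```r
library(data.table)

codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")

data <- data.table(
  id = c(1, 1, 2, 3, 3, 4, 4, 4),
  code = c("A1", "A3", "B1", "A2", "B2", "A1", "B2", "C1")
)

# Hypothetical lookup table: one row per code, tagged with the
# index (1, 2, 3) of the vector the code belongs to.
code_list <- list(codes1, codes2, codes3)
lookup <- data.table(
  code = unlist(code_list),
  grp  = rep(seq_along(code_list), lengths(code_list))
)

# Join the tag onto every row of `data` (unmatched codes get grp = NA),
# then count the distinct non-missing tags within each id.
res <- lookup[data, on = "code"][
  , .(match = uniqueN(grp, na.rm = TRUE)), by = id
]
```

For the sample data this should reproduce the match column shown above (1, 1, 2, 3). Because unmatched codes are kept as NA and then ignored by `na.rm = TRUE`, an id whose codes appear in none of the vectors would get a count of 0 rather than being dropped.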