R data.table and dplyr – count number of elements in each list

August 1, 2023

I have a tidyverse pipeline (it would be great if this can be solved within it) that takes a data.table object and checks whether a keyword (kw, here ‘agree’) is mentioned at all. It returns a data.table with a list column holding every match found in each row:

    library(dplyr)    # mutate(), select(), %>%
    library(stringr)  # str_extract_all(), regex()

    test <- ptadfmatching[, "text"] %>%
      mutate(new_var = str_extract_all(text, regex(kw[x], ignore_case = TRUE))) %>%
      select(new_var)

The result is something like this

    > test
                             new_var
                               <list>
     1:             AGREE,Agree,agree
     2:             Agree,Agree,Agree
     3:                   agree,Agree
     4:                   agree,Agree
     5:                         Agree
     6:                         agree
     7:                   Agree,Agree
     8:             Agree,Agree,Agree
     9:             Agree,Agree,agree
    10:

Question: how do I get the length of each list in ‘test’ (without a loop)?

Solution:

You can try lengths(), which is probably what you wanted. ?lengths shows:

Get the length of each element of a list or atomic vector (is.atomic) as an integer or numeric vector.
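To make the difference from length() concrete, here is a minimal base R sketch. The list `x` is made up for illustration and only mimics the shape of the `new_var` column.

```r
# Hypothetical list, shaped like the `new_var` list column in the question
x <- list(
  c("AGREE", "Agree", "agree"),
  c("agree", "Agree"),
  "Agree",
  character(0)
)

# length() counts the elements of the list itself
n_elements <- length(x)

# lengths() returns the length of each element, as an integer vector
per_element <- lengths(x)

# The explicit per-element version gives the same result, with more typing
per_element_vapply <- vapply(x, length, integer(1))

print(n_elements)
print(per_element)
```

Because lengths() is vectorised over the list, no explicit loop or apply call is needed.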

A simple example:

    test <- data.table::as.data.table(structure(
      list(new_var = list(
        c("AGREE", "Agree", "agree"),
        c("Agree", "Agree", "Agree"),
        c("agree", "Agree"),
        c("agree", "Agree"),
        "Agree",
        "agree",
        c("Agree", "Agree"),
        c("Agree", "Agree", "Agree"),
        c("Agree", "Agree", "agree"),
        character(0)
      )),
      row.names = c(NA, -10L),
      class = c("data.table", "data.frame")
    ))

    # dplyr
    test |>
      dplyr::mutate(count = lengths(new_var))
    #>               new_var count
    #>  1: AGREE,Agree,agree     3
    #>  2: Agree,Agree,Agree     3
    #>  3:       agree,Agree     2
    #>  4:       agree,Agree     2
    #>  5:             Agree     1
    #>  6:             agree     1
    #>  7:       Agree,Agree     2
    #>  8: Agree,Agree,Agree     3
    #>  9: Agree,Agree,agree     3
    #> 10:                       0

    # data.table
    test[, count := lengths(new_var)]
    test
    #>               new_var count
    #>  1: AGREE,Agree,agree     3
    #>  2: Agree,Agree,Agree     3
    #>  3:       agree,Agree     2
    #>  4:       agree,Agree     2
    #>  5:             Agree     1
    #>  6:             agree     1
    #>  7:       Agree,Agree     2
    #>  8: Agree,Agree,Agree     3
    #>  9: Agree,Agree,agree     3
    #> 10:                       0

Created on 2023-08-01 with reprex v2.0.2
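
If only the counts are needed and not the matches themselves, stringr::str_count() may be an option, since it counts matches per string without building the intermediate list. The sketch below is not part of the original answer; the `text` vector and `kw` value are hypothetical stand-ins, because the real `ptadfmatching$text` and `kw` are not shown in the question.

```r
library(stringr)

# Hypothetical input standing in for the question's text column and keyword
text <- c("I AGREE. Agree? agree!", "We agree, Agree.", "No keyword here")
kw <- "agree"

# Count matches per string directly
n_direct <- str_count(text, regex(kw, ignore_case = TRUE))

# Same counts via the extract-then-lengths() route used above
n_via_list <- lengths(str_extract_all(text, regex(kw, ignore_case = TRUE)))

print(n_direct)
print(n_via_list)
```

Both routes should give the same counts; the lengths() approach is the better fit when the list of matches is kept for later use.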