I have a tidyverse pipeline (if it can be solved here, that would be great) which takes a data.table object and checks whether a keyword (kw; here it is 'agree') is mentioned any number of times. It returns a data.table with a list column holding every match found in each row:
test <- ptadfmatching[, "text"] %>%
  mutate(new_var = str_extract_all(text, regex(kw[x], ignore_case = TRUE))) %>%
  select(new_var)
The result looks something like this:
> test
              new_var
               <list>
 1: AGREE,Agree,agree
 2: Agree,Agree,Agree
 3:       agree,Agree
 4:       agree,Agree
 5:             Agree
 6:             agree
 7:       Agree,Agree
 8: Agree,Agree,Agree
 9: Agree,Agree,agree
10:
Question: how do I get the length of each list in 'test' (without a loop)?
Solution:
You can use lengths(), which is probably what you want. ?lengths says:
Get the length of each element of a list or atomic vector (is.atomic) as an integer or numeric vector.
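To make the contrast with length() concrete, here is a minimal base R sketch on a bare list. The list x is made up for illustration and is not the asker's data:

```r
# Hypothetical list mimicking the list column (not the asker's data)
x <- list(c("AGREE", "Agree", "agree"), "Agree", character(0))

# length() counts the elements of the outer list
n_outer <- length(x)

# lengths() returns one length per element, with no explicit loop
per_element <- lengths(x)

# Same result through an explicit apply call, just more verbose
via_vapply <- vapply(x, length, integer(1))

print(n_outer)
print(per_element)
```

Here length(x) gives a single number for the whole list, while lengths(x) gives one number per element, including 0 for the empty character vector.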
A simple example:
test <- data.table::as.data.table(structure(
  list(new_var = list(
    c("AGREE", "Agree", "agree"),
    c("Agree", "Agree", "Agree"),
    c("agree", "Agree"),
    c("agree", "Agree"),
    "Agree",
    "agree",
    c("Agree", "Agree"),
    c("Agree", "Agree", "Agree"),
    c("Agree", "Agree", "agree"),
    character(0)
  )),
  row.names = c(NA, -10L),
  class = c("data.table", "data.frame")
))
# dplyr
test |>
  dplyr::mutate(count = lengths(new_var))
#>               new_var count
#>  1: AGREE,Agree,agree     3
#>  2: Agree,Agree,Agree     3
#>  3:       agree,Agree     2
#>  4:       agree,Agree     2
#>  5:             Agree     1
#>  6:             agree     1
#>  7:       Agree,Agree     2
#>  8: Agree,Agree,Agree     3
#>  9: Agree,Agree,agree     3
#> 10:                       0
# data.table
test[, count := lengths(new_var)]
test
#>               new_var count
#>  1: AGREE,Agree,agree     3
#>  2: Agree,Agree,Agree     3
#>  3:       agree,Agree     2
#>  4:       agree,Agree     2
#>  5:             Agree     1
#>  6:             agree     1
#>  7:       Agree,Agree     2
#>  8: Agree,Agree,Agree     3
#>  9: Agree,Agree,agree     3
#> 10:                       0
Created on 2023-08-01 with reprex v2.0.2
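If only the counts are needed and not the matches themselves, the intermediate list column can probably be skipped. A minimal sketch, assuming stringr is installed and the pattern is the same case-insensitive regex as in the question; texts and kw below are hypothetical stand-ins, not the asker's data:

```r
library(stringr)

# Hypothetical inputs standing in for the asker's text column and keyword
texts <- c("I AGREE, you Agree, we agree", "No opinion here", "Agree to agree")
kw <- "agree"
pattern <- regex(kw, ignore_case = TRUE)

# Route 1: extract all matches into a list, then count per element
matches <- str_extract_all(texts, pattern)
counts_from_list <- lengths(matches)

# Route 2: count the matches directly, without building the list
counts_direct <- str_count(texts, pattern)

print(counts_from_list)
print(counts_direct)
```

Both routes should give the same integer vector. str_count() is the shorter path when the matched strings are not needed afterwards.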