I have a single character vector that looks like this:
main.list <- c("dog", "cat", "bird", "snake")
I have several comparator vectors of equal size that share some, but not all, of the elements in main.list:
comparator.list1 <- c("dog", "cat", "bird", "crescent")
comparator.list2 <- c("dog", "lizard", "cup", "plate")
comparator.list3 <- c("lizard", "bird", "squirrel", "snake")
I want to build a table of the proportion of elements that each comparator vector shares with the main list. So in this case:
List              Number.of.shared.elements
comparator.list1  0.75
comparator.list2  0.25
comparator.list3  0.5
How can I do that?
>Solution :
Collect the ‘comparator’ objects into a named list with mget, use %in% to get a logical vector marking which elements of ‘main.list’ occur in each one, take the mean to turn that into a proportion, and stack the name/value pairs into a two-column data.frame (the [2:1] puts the name column first):
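# mget() fetches every object whose name matches 'comparator' as a named list
out <- stack(lapply(mget(ls(pattern = 'comparator')),
    # mean() of a logical vector is the proportion of TRUE values
    function(x) mean(main.list %in% x)))[2:1]
names(out) <- c("List", "Number.of.shared.elements")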
-output
> out
              List Number.of.shared.elements
1 comparator.list1                      0.75
2 comparator.list2                      0.25
3 comparator.list3                      0.50
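Note that ls(pattern = 'comparator') matches every object in the current environment whose name contains 'comparator', so unrelated objects can slip in. A minimal sketch of the same computation over an explicitly built named list, assuming you are free to define the vectors that way; the name `comparators` is hypothetical and introduced here only for illustration:

```r
main.list <- c("dog", "cat", "bird", "snake")

# Hypothetical container: build the named list by hand instead of via mget()
comparators <- list(
  comparator.list1 = c("dog", "cat", "bird", "crescent"),
  comparator.list2 = c("dog", "lizard", "cup", "plate"),
  comparator.list3 = c("lizard", "bird", "squirrel", "snake")
)

# Proportion of main.list elements found in each comparator vector;
# vapply() checks that every result is a single numeric value
shared <- vapply(comparators, function(x) mean(main.list %in% x), numeric(1))

out <- data.frame(List = names(shared),
                  Number.of.shared.elements = unname(shared))
out
```

With the example data this should give the same three proportions as the stack() version.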
Alternatively, take the length of the intersect with main.list and divide it by the length of the comparator vector:
out <- stack(lapply(mget(ls(pattern = 'comparator')),
    function(x) length(intersect(main.list, x))/length(x)))[2:1]
names(out) <- c("List", "Number.of.shared.elements")
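The two approaches agree here only because every vector has four distinct elements. The %in% version divides by the length of main.list, while the intersect version divides by the length of the comparator. A small sketch of where they diverge, using a hypothetical shorter comparator (`short.comparator` is not part of the original question):

```r
main.list <- c("dog", "cat", "bird", "snake")

# Hypothetical comparator that is shorter than main.list
short.comparator <- c("dog", "cat")

# Share of main.list that appears in the comparator: 2 of 4
prop.of.main <- mean(main.list %in% short.comparator)

# Share of the comparator that appears in main.list: 2 of 2
prop.of.comparator <- length(intersect(main.list, short.comparator)) /
  length(short.comparator)

c(prop.of.main = prop.of.main, prop.of.comparator = prop.of.comparator)
```

If the vectors can differ in length or contain duplicates, pick the denominator that matches the proportion you actually want to report.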
Or, using the tidyverse:
library(dplyr)
library(tibble)
library(tidyr)
mget(ls(pattern = 'comparator')) %>%
  enframe(name = 'List') %>%
  unnest(value) %>%
  group_by(List) %>%
  summarise(Number.of.shared.elements = length(intersect(value,
      main.list))/n(), .groups = 'drop')
# A tibble: 3 × 2
  List             Number.of.shared.elements
  <chr>                                <dbl>
1 comparator.list1                      0.75
2 comparator.list2                      0.25
3 comparator.list3                      0.5
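A shorter tidyverse variant is possible if the purrr package is available, which is an assumption since the answer above does not load it. This sketch maps over a hand-built named list (the name `comparators` is hypothetical) and skips the unnest/group_by round trip:

```r
library(purrr)
library(tibble)

main.list <- c("dog", "cat", "bird", "snake")

# Hypothetical named list standing in for mget(ls(pattern = 'comparator'))
comparators <- list(
  comparator.list1 = c("dog", "cat", "bird", "crescent"),
  comparator.list2 = c("dog", "lizard", "cup", "plate"),
  comparator.list3 = c("lizard", "bird", "squirrel", "snake")
)

# map_dbl() returns a named numeric vector, one proportion per comparator
shared <- map_dbl(comparators, function(x) mean(main.list %in% x))

# enframe() turns the named vector into a two-column tibble
out <- enframe(shared, name = "List", value = "Number.of.shared.elements")
out
```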