Count how many times strings from one data frame appear in another data frame in R dplyr

I have two data frames that look like this:

df1 <- data.frame(reference=c("cat","dog"))
print(df1)
#>   reference
#> 1       cat
#> 2       dog
df2 <- data.frame(data=c("cat","car","catt","cart","dog","dog","pitbull"))
print(df2)
#>      data
#> 1     cat
#> 2     car
#> 3    catt
#> 4    cart
#> 5     dog
#> 6     dog
#> 7 pitbull

Created on 2021-12-29 by the reprex package (v2.0.1)

I want to count how many times each word in df1 (cat and dog) appears as an exact match in df2.
I want my output to look like this:

animals   n
cat       1
dog       2

Any help or guidance is appreciated. My reference list is huge; I tried to grep each entry individually, but that will take too much time.

Thank you for your time. Happy holidays

Solution:

A possible solution, tidyverse-based:

library(tidyverse)

df1 <- data.frame(reference=c("cat","dog"))
df2 <- data.frame(data=c("cat","car","catt","cart","dog","dog","pitbull"))

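# Group by each reference word so `reference` is a single value inside summarise(),
# then count the rows of df2 whose value matches it exactly.
df1 %>%
  group_by(animal = reference) %>%
  summarise(n = sum(reference == df2$data), .groups = "drop")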

#> # A tibble: 2 × 2
#>   animal     n
#>   <chr>  <int>
#> 1 cat        1
#> 2 dog        2
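
Since the reference list is described as huge, a join-based variant may scale better than comparing every reference word against all of df2. The following is only a sketch under that assumption, not a benchmarked result: it tallies df2 once with count(), then attaches the tallies to df1 with left_join(). The names `counts` and `result` are illustrative, and the `animals` column name is taken from the desired output in the question.

```r
library(dplyr)

df1 <- data.frame(reference = c("cat", "dog"))
df2 <- data.frame(data = c("cat", "car", "catt", "cart", "dog", "dog", "pitbull"))

# Tally every distinct string in df2 once.
counts <- df2 %>%
  count(data, name = "n")

# Keep only the reference words, attaching their tallies.
# Words that never occur in df2 come back as NA from the join, so set them to 0.
result <- df1 %>%
  left_join(counts, by = c("reference" = "data")) %>%
  mutate(n = coalesce(n, 0L)) %>%
  rename(animals = reference)

print(result)
```

Like the solution above, this counts exact matches only, so "catt" and "cart" do not contribute to the count for "cat".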