Subsetting a data frame by the values in another data frame

I have these two datasets: df, the main data frame, and g, a second data frame that I created.

df = data.frame(x = seq(1,20,2),y = letters[1:10] )
df

g = data.frame(xx = c(2,3,4,5,7,8,9) )

I want to subset df to the rows whose x values appear in the xx column of g, as follows:

m = df[df$x==g$xx,]

But the result depends on the positions at which the two columns happen to line up, not on which values they have in common.

Output:

> m
  x y
2 3 b

I don't know what error I am making.

>Solution :

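Use %in% instead of ==. The == operator compares the two vectors element by element, while %in% tests whether each value of df$x occurs anywhere in g$xx: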

> df[df$x %in% g$xx,]
  x y
2 3 b
3 5 c
4 7 d
5 9 e
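
The difference between the two operators can be checked directly. The following is a minimal sketch, not part of the original answer: the names recycled, eq_mask and in_mask are introduced only for illustration, and rep_len is used to spell out the recycling that == otherwise performs implicitly (with a length-mismatch warning).

```r
df <- data.frame(x = seq(1, 20, 2), y = letters[1:10])
g  <- data.frame(xx = c(2, 3, 4, 5, 7, 8, 9))

# What == effectively compares against: g$xx repeated to the length of df$x
recycled <- rep_len(g$xx, length(df$x))
eq_mask  <- df$x == recycled      # position-by-position comparison

# What %in% does: TRUE wherever a value of df$x appears anywhere in g$xx
in_mask  <- df$x %in% g$xx

which(eq_mask)   # 2: the only position where the two vectors happen to line up
which(in_mask)   # 2 3 4 5: every row whose x value is present in g$xx
```

So the single row returned by == is a coincidence of ordering, which is why the result changes if g$xx is reordered.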

You can also use inner_join from dplyr:

library(dplyr)
df %>% 
  inner_join(g, by = c("x" = "xx"))
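
If dplyr is not available, base R's merge should give an equivalent inner join. This is a sketch under the assumption that only the matching rows of df are wanted; the variable name m is reused from the question.

```r
df <- data.frame(x = seq(1, 20, 2), y = letters[1:10])
g  <- data.frame(xx = c(2, 3, 4, 5, 7, 8, 9))

# Inner join on df$x == g$xx; by.x/by.y name the key column on each side
m <- merge(df, g, by.x = "x", by.y = "xx")
m
```

Note that a join, unlike %in%, repeats a row of df once for every matching row in g, so the two approaches differ if g$xx contains duplicate values.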

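intersect can be useful too. It returns the shared values rather than row positions, so pass them through match to get the row indices:

df[match(intersect(df$x, g$xx), df$x), ]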