Home R: How to subset dataframe over columns from 2 different dataframe?

Questions

R: How to subset dataframe over columns from 2 different dataframe?

April 4, 2023

How to subset dataframe over columns from 2 different dataframe using R code?

Here is the dummy code:

library(dplyr)
data <- data.frame(b = rep(LETTERS[1:4],2), c = c("B", "A", "A", "E", "G", "H", "K", "L"))

#   b c
# 1 A B
# 2 B A
# 3 C A
# 4 D E
# 5 A G
# 6 B H
# 7 C K
# 8 D L


data2 <- data.frame(d = c("A", "B", ""), e = c("E", "", "C"))
#    d e
#1   A E
#2   B  
#3     C


subset <- subset(data, data$b %in%  c(data2$d, data2$e))

#   b c
# 1 A B
# 2 B A
# 3 C A
# 5 A G
# 6 B H
# 7 C K

As you can see, i can use subset() function to overlap "data" to "data2". But what if i have large number of columns in "data2"? is there a way to simplify this code? If possible tidyverse approach is preferred.

I tried to use below code, but its not working.

subset_try <- subset(data, data$b %in%  data2[,c(1:2)])
#[1] b c
#<0 rows> (or 0-length row.names)

Thank you.

>Solution :

If there are lots of columns, unlist into a vector and subset

subset(data, b %in% unlist(data2))

If we want only a subset of columns, then select the subset of columns and unlist – Note that data2[, c(1, 2)] is still a data.frame with two columns and not a vector, thus when we do

> data$b %in% data2
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

It has to do with the table argument with %in%. According to ?"%in%"

x %in% table

table – vector or NULL: the values to be matched against. Long vectors are not supported.

Therefore, we may want to convert to vector

> data$b %in% unlist(data2)
[1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE
> subset(data, b %in% unlist(data2[1:2]))

For the tidyverse, it is just replacing the subset with filter

library(dplyr)
filter(data, b %in% unlist(data2[1:2]))

tidyverse

byMR

Published April 04, 2023

Add a comment

Bar ggplot with two y-axes

byMR

April 4, 2023

Questions

Limit list data in dart

byMR

April 4, 2023

Questions

How to "safely" use cin with a char

byMR

April 4, 2023

Questions

Monitor batch job by Prometheus

byMR

April 4, 2023

Questions

What are the performance and memory differences between a local array and a static const local array?

byMR

April 4, 2023

R: How to subset dataframe over columns from 2 different dataframe?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Bar ggplot with two y-axes

Limit list data in dart

How to "safely" use cin with a char

Monitor batch job by Prometheus

What are the performance and memory differences between a local array and a static const local array?

Keep Up to Date with the Most Important News

R: How to subset dataframe over columns from 2 different dataframe?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Bar ggplot with two y-axes

Value not selected in dropdown flutter

Limit list data in dart

How to "safely" use cin with a char

Monitor batch job by Prometheus

What are the performance and memory differences between a local array and a static const local array?

Discover more from Dev solutions