If I have a data frame like this:
df = data.frame(A = sample(1:5, 10, replace = TRUE), B = sample(1:5, 10, replace = TRUE), C = sample(1:5, 10, replace = TRUE), D = sample(1:5, 10, replace = TRUE), E = sample(1:5, 10, replace = TRUE))
Giving me this:
   A B C D E
1  1 5 1 4 3
2  2 3 5 4 3
3  4 2 2 4 4
4  2 1 2 5 2
5  3 3 4 4 5
6  3 2 3 1 5
7  1 5 4 2 3
8  1 3 5 5 1
9  3 1 1 3 5
10 5 3 1 2 4
How do I get a subset that includes all the rows where at least one of certain columns (B or D, say) is equal to 1, with the columns identified by their index numbers (2 and 4) rather than their names? In this case:
  A B C D E
4 2 1 2 5 2
6 3 2 3 1 5
9 3 1 1 3 5
>Solution :
df[rowSums(df[c(2,4)] == 1) > 0,]
#   A B C D E
# 4 2 1 2 5 2
# 6 3 2 3 1 5
# 9 3 1 1 3 5
- You said to compare values by column index, so use df[c(2,4)] (or df[, c(2,4)]).
- df[c(2,4)] == 1 returns a matrix of logicals, indicating whether each cell’s value is equal to 1.
- rowSums(.) > 0 finds those rows with at least one 1.
- df[rowSums(.) > 0, ] selects just those rows.
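To make the steps above concrete, here is a minimal sketch that splits the one-liner into named intermediate values. The variable names (cols, is_one, hits, keep, result) are only illustrative and are not part of the original answer; the data frame is rebuilt inline so the block runs on its own.

```r
# Same values as in the Data section of this answer
df <- data.frame(
  A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L),
  B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L),
  C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L),
  D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L),
  E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)
)

cols   <- c(2, 4)          # column positions of B and D
is_one <- df[cols] == 1    # 10 x 2 logical matrix, TRUE where the cell is 1
hits   <- rowSums(is_one)  # per row: how many of the chosen columns equal 1
keep   <- hits > 0         # TRUE for rows with at least one 1
result <- df[keep, ]       # rows 4, 6 and 9

print(result)
```

Note the trailing comma in df[keep, ]: without it, the logical vector would be used to select columns rather than rows.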
Data
df <- structure(list(A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L), B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L), C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L), D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L), E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
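The rowSums pattern can also be adapted if the requirement changes slightly. The sketch below is an assumption-laden illustration, not part of the original answer: rows_matching is a hypothetical helper name, the toy data frame is invented for the example, and na.rm = TRUE is one possible way to treat missing values (here, an NA simply does not count as a match).

```r
# Hypothetical helper (illustrative only): returns a logical vector marking
# rows where the columns at positions `cols` equal `value`.
# require_all = FALSE: at least one column matches (as in the answer above).
# require_all = TRUE:  every listed column must match.
rows_matching <- function(data, cols, value = 1, require_all = FALSE) {
  # na.rm = TRUE so that an NA cell counts as "no match" instead of
  # turning the whole row's sum into NA
  n_match <- rowSums(data[cols] == value, na.rm = TRUE)
  if (require_all) n_match == length(cols) else n_match > 0
}

# Invented toy data with some missing values
toy <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(1, NA, 1, 2),
  C = c(9, 9, 9, 9),
  D = c(1, 1, 3, NA)
)

any_rows <- toy[rows_matching(toy, c(2, 4)), ]                      # rows 1, 2, 3
all_rows <- toy[rows_matching(toy, c(2, 4), require_all = TRUE), ]  # row 1 only
```

Without na.rm = TRUE, a row containing NA in one of the chosen columns would get an NA sum, and indexing a data frame with a logical NA produces a row filled with NA rather than dropping the row.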