I am trying to filter a data frame using dplyr and I can’t really think of a way to achieve what I want. I have a data frame of the following form:
A B C
-----------
1 2 5
1 4 6
2 2 7
2 4 6
Each value in column A appears exactly 2 times. Column B has exactly 2 different values, each appearing exactly once for each value of A. Column C can have any positive values. I want to keep all rows where for one value of A, the row with the bigger B value has a smaller C value than the row with the smaller B value. In the example above, this would result in:
A B C
-----------
2 2 7
2 4 6
Is there a way to achieve this using dplyr?
>Solution :
Sort by A and B to ensure that the larger B is always the second within A and then grouping by A use a filter based on diff(C) < 0.
library(dplyr)
DF %>%
arrange(A, B) %>%
group_by(A) %>%
filter((diff(C) < 0)) %>%
ungroup
giving:
# A tibble: 2 × 3
A B C
<int> <int> <int>
1 2 2 7
2 2 4 6
Note
Lines <- "A B C
1 2 5
1 4 6
2 2 7
2 4 6"
DF <- read.table(text = Lines, header = TRUE)