# Filter groups based on difference two highest values

I have the following dataframe called df (`dput` below):

``````> df
group value
1      A     5
2      A     1
3      A     1
4      A     5
5      B     8
6      B     2
7      B     2
8      B     3
9      C    10
10     C     1
11     C     1
12     C     8
``````

I would like to filter groups based on the difference between their highest value (max) and second highest value. The difference should be smaller equal than 2 (<=2), this means that group B should be removed because the highest value is 8 and the second highest value is 3 which is a difference of 5. The desired output should look like this:

``````  group value
1     A     5
2     A     1
3     A     1
4     A     5
5     C    10
6     C     1
7     C     1
8     C     8
``````

So I was wondering if anyone knows how to filter groups based on the difference between their highest and second-highest value?

`dput` of df:

``````df<-structure(list(group = c("A", "A", "A", "A", "B", "B", "B", "B",
"C", "C", "C", "C"), value = c(5, 1, 1, 5, 8, 2, 2, 3, 10, 1,
1, 8)), class = "data.frame", row.names = c(NA, -12L))
``````

### >Solution :

Using `dplyr`

``````library(dplyr)

df %>%
group_by(group) %>%
filter(abs(diff(sort(value, decreasing=T)[1:2])) <= 2) %>%
ungroup()
# A tibble: 8 × 2
group value
<chr> <int>
1 A         5
2 A         1
3 A         1
4 A         5
5 C        10
6 C         1
7 C         1
8 C         8
``````

A base R alternative

``````grp <- aggregate(. ~ group, df, function(x)
abs(diff(sort(x, decreasing=T)[1:2])) <= 2)

do.call(rbind, c(mapply(function(g, v)
list(df[df\$group == g & v,]), grp\$group, grp\$value), make.row.names=F))
group value
1     A     5
2     A     1
3     A     1
4     A     5
5     C    10
6     C     1
7     C     1
8     C     8
``````