I’m having a large dataset with a value called CPE and a percentage PERC. In real world it’s slightly different, but I reduced it to the minimum necessary.
| CPE| PERC|
|---:|---------:|
| 42| 0.1132664|
| 264| 0.9260718|
| 470| 0.3732287|
| 316| 0.7437126|
| 9| 0.5819554|
| 114| 0.2052649|
Now I want to filter. Smaller amounts of CPE require a higher PERC to get in focus.
Surely I could do it this way:
df[(CPE > 20 & PERC > 0.95) |
(CPE > 50 & PERC > 0.9) |
(CPE > 100 & PERC > 0.85) |
(CPE > 250 & PERC > 0.8)]
But as there can be many more in real world and additionally they can vary from time to time, I’m looking for an easier and more dynamic solution.
Something like creating a list with all combinations and then using each combination as a filter-condition-pair.
list(c(20, 0.95), c(50, 0.9), c(100,0.85), c(250,0.8))
Is there a smart way to handle this? Even if I’m preferring data.table, dplyr is great as well. And it haven’t be a list, I’m open for any similar approach.
MWE
library(data.table)
set.seed(33)
df = data.table(CPE=sample(1:500, 100),
PERC=runif(min = 0.1, max = 1, n=100))
>Solution :
If the list is already created, we can loop over the list, create the logical condition with the first and second element using CPE and PERC columns respectively, Reduce to a single logical vector with | and subset the data
df[Reduce(`|`, lapply(lst1, \(x) CPE > x[1] & PERC > x[2]))]
data
lst1 <- list(c(20, 0.95), c(50, 0.9), c(100,0.85), c(250,0.8))