Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Conditional filtering with data.table with multiple statements

I would like to know if there is an elegant and concise way to do conditional filtering with data.table.

My aim is the following:
if condition 1 is met, filter based on condition 2.

For instance, in the case of the iris dataset,
how can I drop the observations among Species=="setosa" where Sepal.Length<5.5, while keeping all observations with Sepal.Length<5.5 for other species?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I know how to do this in steps, but I wonder if there is a better way to do it in a single liner

# this is how I would do it in steps. 

data("iris")

# first only select observations in setosa I am interested in keeping 
iris1<- setDT(iris)[Sepal.Length>=5.5&Species=="setosa"] 

# second, drop all of setosa observations. 
iris2<- setDT(iris)[Species!="setosa"] 

# join data,
iris_final<-full_join(iris1,iris2)

head(iris_final)
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1:          5.8         4.0          1.2         0.2     setosa
2:          5.7         4.4          1.5         0.4     setosa
3:          5.7         3.8          1.7         0.3     setosa
4:          5.5         4.2          1.4         0.2     setosa
5:          5.5         3.5          1.3         0.2     setosa # only keeping setosa with Sepal.Length>=5.5. Note that for other species, Sepal.Length can be <5.5
6:          7.0         3.2          4.7         1.4 versicolor

is there a more concise and elegant way of doing this?

>Solution :

Is something like the following what you are looking for? It is not very clear what you want.

library(data.table)

dt <- data.table(iris)
dt[Sepal.Length >= 5.5 & Species == "setosa" | Species != "setosa"]

#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>   1:          5.8         4.0          1.2         0.2    setosa
#>   2:          5.7         4.4          1.5         0.4    setosa
#>   3:          5.7         3.8          1.7         0.3    setosa
#>   4:          5.5         4.2          1.4         0.2    setosa
#>   5:          5.5         3.5          1.3         0.2    setosa
#>  ---                                                            
#> 101:          6.7         3.0          5.2         2.3 virginica
#> 102:          6.3         2.5          5.0         1.9 virginica
#> 103:          6.5         3.0          5.2         2.0 virginica
#> 104:          6.2         3.4          5.4         2.3 virginica
#> 105:          5.9         3.0          5.1         1.8 virginica
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading