Remove rows if a certain condition occurs

August 10, 2023

I’m dealing with a massive dataset. Here is a small example:

df <- data.frame(
  country = rep(c("France", "Italy", "Spain"), each = 4),
  year    = rep(2000:2003, times = 3),
  X       = 1:12
)

I’d like to remove all the rows for a given country if (in this example) X > 7 in 2002. As a result, Spain should disappear.

>Solution :

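You can use match() to keep only those countries whose value of X is less than or equal to 7 in the year 2002. Within each country, match(2002, year) returns the position of the first row with year 2002, so X[match(2002, year)] is that country's X for 2002. Because the condition is a single value per group, filter() keeps or drops all rows of the country at once. Note that the .by argument requires dplyr 1.1.0 or later.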

library(dplyr)

df %>% filter(X[match(2002, year)] <= 7, .by = country)

#  country year X
#1  France 2000 1
#2  France 2001 2
#3  France 2002 3
#4  France 2003 4
#5   Italy 2000 5
#6   Italy 2001 6
#7   Italy 2002 7
#8   Italy 2003 8
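
If dplyr is not available, the same idea can be sketched in base R: first collect the countries that break the rule, then drop every row belonging to them. This is only an illustrative alternative, not part of the answer above; the names bad and result are made up for the example. One difference to be aware of: a country with no 2002 row is kept here, whereas the match() version drops it (the condition evaluates to NA and filter() discards NA).

```r
# Example data from the question
df <- data.frame(
  country = rep(c("France", "Italy", "Spain"), each = 4),
  year    = rep(2000:2003, times = 3),
  X       = 1:12
)

# Countries that violate the condition (X > 7 in 2002)
bad <- unique(df$country[df$year == 2002 & df$X > 7])

# Keep only the rows of the remaining countries
result <- df[!df$country %in% bad, ]
result
```

For a very large dataset this does one vectorised pass to find the offending countries and one to subset, so it should scale reasonably, though it is worth timing against the dplyr version on the real data.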