df_input is the data frame I have, and I want to transform it into df_output. The goal is to carry the value in the winner column forward within each "assembly". For instance, the years 2001-2003 have assembly = 1 and there is a winner in 2001, so that winner should apply to every following year until the assembly changes.
df_input <- data.frame(winner = c(1,0,0,0,2,0,0,0,1,0,0,0,0),
                       assembly = c(1,1,1,2,2,2,3,3,3,3,4,4,4),
                       year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))
df_output <- data.frame(winner = c(1,1,1,0,2,2,0,0,1,1,0,0,0),
                        assembly = c(1,1,1,2,2,2,3,3,3,3,4,4,4),
                        year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))
I don’t know where to start. Any help would be appreciated.
>Solution:
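One option would be to use tidyr::fill. Convert the zeros to NA, fill the missing values downward within each assembly, then turn any remaining NA back into 0: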
library(dplyr)
library(tidyr)

df_input %>%
  mutate(winner = if_else(winner > 0, winner, NA_real_)) %>%
  group_by(assembly) %>%
  fill(winner) %>%
  ungroup() %>%
  replace_na(list(winner = 0))
#> # A tibble: 13 × 3
#> winner assembly year
#> <dbl> <dbl> <dbl>
#> 1 1 1 2001
#> 2 1 1 2002
#> 3 1 1 2003
#> 4 0 2 2004
#> 5 2 2 2005
#> 6 2 2 2006
#> 7 0 3 2007
#> 8 0 3 2008
#> 9 1 3 2009
#> 10 1 3 2010
#> 11 0 4 2011
#> 12 0 4 2012
#> 13 0 4 2013
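
If you would rather avoid extra packages, the same carry-forward can probably be done in base R with ave(). This is only a sketch, assuming a winner value of 0 always means "no winner yet"; the helper name carry_forward is made up for illustration and is not part of any package.

```r
df_input <- data.frame(winner = c(1,0,0,0,2,0,0,0,1,0,0,0,0),
                       assembly = c(1,1,1,2,2,2,3,3,3,3,4,4,4),
                       year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))

# Hypothetical helper: carry the last non-zero value forward within one group.
carry_forward <- function(x) {
  # Position of the most recent non-zero entry at each row (0 if none so far)
  idx <- cummax(seq_along(x) * (x > 0))
  # Prepend a 0 so that position 0 maps to "no winner yet"
  c(0, x)[idx + 1]
}

df_output <- df_input
# ave() applies carry_forward separately to the winner values of each assembly
df_output$winner <- ave(df_input$winner, df_input$assembly, FUN = carry_forward)

df_output$winner
```

Because ave() returns its result in the original row order, there is no grouping to undo afterwards. The zeros are never turned into NA, so no replace_na step is needed either.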