How to update a variable depending on another variable in R?

December 5, 2021

df_input is the data frame I have, and I want to transform it into df_output. The goal is to carry the value in the winner column forward within each "assembly". For instance, the years 2001-2003 have assembly = 1 and there is a winner in 2001, so that winner should apply to every later year until the assembly changes.

df_input <- data.frame(winner  = c(1,0,0,0,2,0,0,0,1,0,0,0,0), 
                       assembly= c(1,1,1,2,2,2,3,3,3,3,4,4,4), 
                       year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))    

df_output <- data.frame(winner  = c(1,1,1,0,2,2,0,0,1,1,0,0,0), 
                       assembly= c(1,1,1,2,2,2,3,3,3,3,4,4,4), 
                       year = c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013))

I don't have a clue where to start. Any help would be appreciated.

>Solution :

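One option is to use tidyr::fill: turn the zeros in winner into NA, fill the missing values downward within each assembly, and then replace any remaining NA with 0: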

library(dplyr)
library(tidyr)   

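df_input %>%
  # Treat 0 as "no winner yet" so that fill() has something to fill
  mutate(winner = if_else(winner > 0, winner, NA_real_)) %>% 
  group_by(assembly) %>% 
  # Carry the last non-missing winner downward within each assembly
  fill(winner) %>% 
  ungroup() %>% 
  # Rows before the first winner of an assembly go back to 0
  replace_na(list(winner = 0))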
#> # A tibble: 13 × 3
#>    winner assembly  year
#>     <dbl>    <dbl> <dbl>
#>  1      1        1  2001
#>  2      1        1  2002
#>  3      1        1  2003
#>  4      0        2  2004
#>  5      2        2  2005
#>  6      2        2  2006
#>  7      0        3  2007
#>  8      0        3  2008
#>  9      1        3  2009
#> 10      1        3  2010
#> 11      0        4  2011
#> 12      0        4  2012
#> 13      0        4  2013
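
If you would rather avoid the extra packages, the same carry-forward can be sketched in base R with ave(). This is only a minimal sketch: the helper name carry_forward and the result name df_result are made up for illustration, and it assumes the rows are already sorted by year within each assembly and that 0 always means "no winner".

```r
df_input <- data.frame(winner   = c(1,0,0,0,2,0,0,0,1,0,0,0,0),
                       assembly = c(1,1,1,2,2,2,3,3,3,3,4,4,4),
                       year     = 2001:2013)

# Hypothetical helper: for each position, return the most recent
# non-zero value seen so far, or 0 if there has been none yet.
carry_forward <- function(x) {
  # Position of the latest non-zero entry up to each element (0 if none)
  idx <- cummax(seq_along(x) * (x > 0))
  # Prepend a 0 so that idx == 0 maps to "no winner"
  c(0, x)[idx + 1]
}

df_result <- df_input
# ave() applies the helper separately within each assembly
df_result$winner <- ave(df_input$winner, df_input$assembly, FUN = carry_forward)

df_result$winner
```

Comparing df_result$winner with df_output$winner (for example with all.equal) is a quick way to confirm that the result matches the desired output.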