Fill only values between observations observed

April 14, 2023

Let suppose the following dataset :

ID	Number
AX	1
AX	NA
AX	NA
AX	2
AX	NA
AX	NA
AX	NA

I would like to only fill NA which are between real observations observed, such as :

ID	Number
AX	1
AX	1
AX	1
AX	2
AX	NA
AX	NA
AX	NA

I tried to do it with tidyr::fill (DF %>% fill(Number)) but i cannot reproduce those results because it fills all the NA values. How can i do it ?

>Solution :

Here’s one method that fills and restores the NA for those cases where follow-on non-NA values were not found.

library(dplyr)
library(tidyr) # fill
quux %>%
  group_by(ID) %>%
  mutate(hasmore = rev(cumany(!is.na(rev(Number))))) %>%
  fill(Number) %>%
  mutate(Number = if_else(hasmore, Number, Number[NA])) %>%
  ungroup()
# # A tibble: 7 × 3
#   ID    Number hasmore
#   <chr> <chr>  <lgl>  
# 1 AX    1      TRUE   
# 2 AX    1      TRUE   
# 3 AX    1      TRUE   
# 4 AX    2      TRUE   
# 5 AX    <NA>   FALSE  
# 6 AX    <NA>   FALSE  
# 7 AX    <NA>   FALSE

I use Number[NA] instead of just NA because I’m not certain if your Number is character, numeric, or integer; Number[NA] is never wrong, it will always do as intended, but using the wrong type of NA (there are at least ten kinds of NA) can produce an error.