Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Filling in Missing Values in R Based on Previous Rows within Group

I have a dataset in long-format called df containing four key variables: respondent_id, time (the time period), change (a variable indicating the last period before a change of income) and lost_income, a variable indicating when a person lost their income.

The lost_income column is currently empty, only in rows that follow a row where change == -1 does lost_income take on the value 0, indicating that an individual lost their income at that point.

I want to write R code that continues the count of lost_income into both directions, within each group of respondent_id. That means that for each respondent_id, I want the lost_income column, currently containing 0 or NA, to be filled into both directions so that the row preceding the 0 takes on the value -1, the one before that -2 etc. Similarly, the first row after lost_sen_timing==0 should take on the value 1, the next 2, and so on.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Example Data:

   library(data.table)
    df <- data.table(respondent_id = c(1,1,1,1,1,1,2,2,2,2,2),
                            time = c(1,2,3,4,5,6,1,2,3,4,5),
                            change = c(0,0,0,-1,0,0,0,-1,0,0,0),
                            lost_income= c(NA,NA,NA,NA,0,NA,NA,NA,0,NA,NA))

I have tried using fill() function from the tidyr package but it only changes the first row after a non-missing value of lost_income. I want to adapt the code so that it changes, for example, the sequence of respondent 1 from NA,NA,NA,NA,0,NA to -4,-3,-2,-1,0,1 instead of NA,NA,NA,NA,0,1.

How can I fill in missing values in R based on previous rows within group to get the desired lost_income column as shown below?

Desired output:

respondent_id time change   lost_income
             1    1      0            -4
             1    2      0            -3
             1    3      0            -2
             1    4     -1            -1
             1    5      0             0
             1    6      0             1
             2    1      0            -2
             2    2     -1            -1
             2    3      0             0
             2    4      0             1
             2    5      0             2

Any suggestions on how to achieve this in R would be greatly appreciated. Thank you in advance!

>Solution :

library(dplyr)
df %>%
  left_join(
    df %>%
      filter(lost_income == 0) %>%
      select(respondent_id, time0 = time)
  ) %>%
  mutate(lost_income = time - time0)

Result

Joining with `by = join_by(respondent_id)`
# A tibble: 11 × 5
   respondent_id  time change lost_income time0
           <dbl> <dbl>  <dbl>       <dbl> <dbl>
 1             1     1      0          -4     5
 2             1     2      0          -3     5
 3             1     3      0          -2     5
 4             1     4     -1          -1     5
 5             1     5      0           0     5
 6             1     6      0           1     5
 7             2     1      0          -2     3
 8             2     2     -1          -1     3
 9             2     3      0           0     3
10             2     4      0           1     3
11             2     5      0           2     3

Or, using dplyr 1.1.0:

df %>%
  mutate(lost_income = time - time[lost_income %in% 0], .by = respondent_id)

Result

# A tibble: 11 × 4
   respondent_id  time change lost_income
           <dbl> <dbl>  <dbl>       <dbl>
 1             1     1      0          -4
 2             1     2      0          -3
 3             1     3      0          -2
 4             1     4     -1          -1
 5             1     5      0           0
 6             1     6      0           1
 7             2     1      0          -2
 8             2     2     -1          -1
 9             2     3      0           0
10             2     4      0           1
11             2     5      0           2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading