R – change the first two occurrences of a value in each row into NAs

I want the percentage of ones for each year, that is, for each column. My problem is that I have to exclude the first two ones in each row, because at that point the individuals are too young to be included in my analysis. I tried to change the first two ones into NAs, so that I still know there was a one but it is not included in my calculations.
The first six rows of my data set (df) look like the following:

    2007 2008 2009 2010 2011 2012 2013 2014
   1    1    1    1    1   1     1    1    1
   2    0    1    1    1   0     0    0    0
   3    1    1    1    1   1     1    1    1
   4    1    1    1    0   0     0    0    0
   5    0    1    1    1   0     0    0    0
   6    1    1    1    1   1     1    1    1 

The data set should look like the following (expected output):

  2007 2008 2009 2010 2011 2012 2013 2014
 1  NA   NA    1    1   1     1    1    1
 2  0    NA   NA    1   0     0    0    0
 3  NA   NA    1    1   1     1    1    1
 4  NA   NA    1    0   0     0    0    0
 5  0    NA   NA    1   0     0    0    0
 6  NA   NA    1    1   1     1    1    1 

I tried different approaches. Most of them did not work at all.
The following code at least ran, but did not change anything in my data set. Any help would be really appreciated.

 df2 <- df %>% 
  transmute(across(.cols = everything(), .fns = NULL, 
                   (length(x<-which(myRow == 1)) == length(x+1)), NA))

I also tried the following, but got an error:

 df3 <- transmute_if (df,(length(x<-which(myRow == 1)) == length(x+1)), return(NA))

Error: .predicate must have length 1, not 14.

Solution:

Here is a base R way.

df1 <- read.table(text = "
2007 2008 2009 2010 2011 2012 2013 2014
   1    1    1    1    1   1     1    1    1
   2    0    1    1    1   0     0    0    0
   3    1    1    1    1   1     1    1    1
   4    1    1    1    0   0     0    0    0
   5    0    1    1    1   0     0    0    0
   6    1    1    1    1   1     1    1    1
", header = TRUE, check.names = FALSE)

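# For one row, set the first two 1s (or the only 1) to NA.
f <- function(x){
  i <- which(x == 1)           # positions of the 1s in this row
  if(length(i) == 1L) {
    is.na(x) <- i              # a single 1: mask just that one
  } else if (length(i) >= 2L) {
    is.na(x) <- i[1:2]         # two or more 1s: mask the first two
  }
  x
}
# apply() returns the rows as columns, so transpose back; the result is a matrix.
t(apply(df1, 1, f))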
#>   2007 2008 2009 2010 2011 2012 2013 2014
#> 1   NA   NA    1    1    1    1    1    1
#> 2    0   NA   NA    1    0    0    0    0
#> 3   NA   NA    1    1    1    1    1    1
#> 4   NA   NA    1    0    0    0    0    0
#> 5    0   NA   NA    1    0    0    0    0
#> 6   NA   NA    1    1    1    1    1    1

Created on 2022-03-15 by the reprex package (v2.0.1)
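
The question's end goal was the percentage of ones per year, which the answer stops short of. A minimal sketch of that last step follows, assuming the data are coded strictly as 0/1 so that a column mean equals the share of ones. The names `mask_first_two_ones`, `masked` and `pct_ones` are made up for this illustration and do not come from the original post.

```r
# Sample data copied from the question; the first field of each row is the row name.
df1 <- read.table(text = "
2007 2008 2009 2010 2011 2012 2013 2014
   1    1    1    1    1   1     1    1    1
   2    0    1    1    1   0     0    0    0
   3    1    1    1    1   1     1    1    1
   4    1    1    1    0   0     0    0    0
   5    0    1    1    1   0     0    0    0
   6    1    1    1    1   1     1    1    1
", header = TRUE, check.names = FALSE)

# Hypothetical helper: set the first two 1s of a row to NA.
# head(i, 2) returns zero, one or two positions, so no branching is needed.
mask_first_two_ones <- function(x) {
  i <- which(x == 1)
  is.na(x) <- head(i, 2)
  x
}

# Row-wise masking; apply() returns rows as columns, so transpose back.
masked <- t(apply(df1, 1, mask_first_two_ones))

# Share of ones per year in percent, ignoring the masked (NA) cells.
# Assumption: values are only 0 or 1, so the mean is the proportion of ones.
pct_ones <- colMeans(masked, na.rm = TRUE) * 100
print(round(pct_ones, 1))
```

With the six sample rows, 2009 comes out at 100 and 2011 at 50. 2008 is `NaN`, because every value in that column was masked and no observations remain to average.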
