R: Randomly Replace Values with NA

I am working with the R programming language. I am trying to select 10% of the elements in my dataset (excluding elements in the first column) and replace them with NA. I tried to do this with the following code:

library(longitudinalData)
data(artificialLongData)

second_dataset = artificialLongData
second_dataset[sample(nrow(second_dataset), 0.1 * nrow(second_dataset))] <- NA

This produces the following error:

Error in `[<-.data.frame`(`*tmp*`, sample(nrow(second_dataset), 0.1 *  : 
  new columns would leave holes after existing columns

Can someone please show me how to fix this problem?

Thanks!

Note: The final result should look something like this:

  id    t0    t1    t2    t3    t4    t5    t6    t7    t8    t9   t10
1 s1    NA    NA -1.85 -2.05  1.01  1.56    NA  0.52 -0.06 -1.09  0.44
2 s2 -4.88 -2.95 -2.38  3.73 -2.77  1.72 -0.99 -0.70    NA  2.38 -0.72
3 s3    NA -0.86    NA -2.04 -1.18  4.89    NA  0.50  4.90 -0.52    NA

Solution:

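You could use lapply() with replace() to set a random 10% of the elements in each column to NA. Note that this samples within each column, so every column of dat, including the first, gets the same number of NAs.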

set.seed(42)
as.data.frame(lapply(dat, \(x) replace(x, sample(length(x), .1*length(x)), NA)))
#    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# 1  NA  7 NA 10  3 11  4  4 NA   7
# 2   6  6  8  8  4 11 NA  8 10   9
# 3   1 12  4  5 12  3 10  3 11   1
# 4   3 10  6  2 11 NA  3 11  2  11
# 5   8 NA 10 12  5  7  2  9  4  10
# 6  12  4  9 12  9  2  7  9  8   8
# 7   7  5  9  4  2 12 12  3  4   4
# 8  12  5  3  1  6  1  4  7  6  NA
# 9   4  6 12 NA  5  8  4  4  6   7
# 10  3  2 11  3 NA  5  4 NA  2   4

Data:

dat <- structure(list(X1 = c(6L, 6L, 1L, 3L, 8L, 12L, 7L, 12L, 4L, 3L
), X2 = c(7L, 6L, 12L, 10L, 12L, 4L, 5L, 5L, 6L, 2L), X3 = c(1L, 
8L, 4L, 6L, 10L, 9L, 9L, 3L, 12L, 11L), X4 = c(10L, 8L, 5L, 2L, 
12L, 12L, 4L, 1L, 3L, 3L), X5 = c(3L, 4L, 12L, 11L, 5L, 9L, 2L, 
6L, 5L, 3L), X6 = c(11L, 11L, 3L, 9L, 7L, 2L, 12L, 1L, 8L, 5L
), X7 = c(4L, 10L, 10L, 3L, 2L, 7L, 12L, 4L, 4L, 4L), X8 = c(4L, 
8L, 3L, 11L, 9L, 9L, 3L, 7L, 4L, 8L), X9 = c(12L, 10L, 11L, 2L, 
4L, 8L, 4L, 6L, 6L, 2L), X10 = c(7L, 9L, 1L, 11L, 10L, 8L, 4L, 
12L, 7L, 4L)), class = "data.frame", row.names = c(NA, -10L))
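The error in the question arises because indexing a data frame with a single vector, as in second_dataset[i], selects columns, so the sampled row numbers are treated as column positions that do not exist. If the goal is to leave the first column alone and blank out 10% of the remaining cells overall (rather than 10% per column), something like the sketch below may work. The helper name add_random_na, its arguments prop and skip, and the toy data frame are hypothetical and only for illustration; they are not part of the answer above or of any package.

```r
# Hypothetical helper (not from the answer above): set a share of the cells
# to NA, sampling across all columns except those listed in `skip`.
add_random_na <- function(df, prop = 0.1, skip = 1) {
  cols <- setdiff(seq_along(df), skip)               # columns eligible for NA
  n_cells <- nrow(df) * length(cols)                 # number of eligible cells
  picked <- sample(n_cells, round(prop * n_cells))   # distinct cell positions

  # Translate each linear position into a row and a column of `df`
  rows <- (picked - 1) %% nrow(df) + 1
  col_idx <- cols[(picked - 1) %/% nrow(df) + 1]

  # Assign NA one column at a time
  for (j in unique(col_idx)) {
    df[rows[col_idx == j], j] <- NA
  }
  df
}

# Toy data standing in for the real dataset: an id column plus 10 numeric columns
set.seed(42)
toy <- data.frame(id = paste0("s", 1:10), matrix(rnorm(100), nrow = 10))

result <- add_random_na(toy, prop = 0.1, skip = 1)
sum(is.na(result))   # 10, i.e. 10% of the 100 non-id cells
anyNA(result$id)     # FALSE, the first column is untouched
```

Because the cells are drawn from the whole block at once, individual columns can end up with zero or several NAs, which matches the uneven pattern in the expected output shown in the question.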