Randomizing a distribution of data in a list

I have a data frame df that I would like to split into a training set and a test set. Instead of a single training/test split, I would like a collection of them (n = 100), each drawn at random.

I tried to do this with lapply, but every element in the resulting lists is exactly the same. How do I randomize the values in the two lists (i.e., train.data and test.data)?

The expected output is two lists, train.data and test.data, each containing 100 elements, where each element is a different random subset of df.

library(lubridate)
library(tidyverse)
library(caret)

# 300 daily dates; only the length is used below
date <- rep_len(seq(dmy("01-01-2013"), dmy("31-12-2013"), by = "days"), 300)
ID <- rep(c("A", "B", "C"), 50)
class <- rep(c("N", "M"), 50)
# ID and class are shorter than value; data.frame() recycles them to 300 rows
df <- data.frame(value = runif(length(date), min = 0.5, max = 25),
                 ID,
                 class)

# A single stratified 60/40 split, computed once
training.samples <- df$class %>%
  createDataPartition(p = 0.6, list = FALSE)

n <- 100

train.data <- lapply(1:n, function(x) {
  df[training.samples, ]
})
test.data <- lapply(1:n, function(x) {
  df[-training.samples, ]
})

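Solution:

In your code, training.samples is computed once, before lapply runs, so every list element reuses the same row indices. Wrap the partitioning in a function and call it with replicate, so that createDataPartition draws a new split on each iteration: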

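# Draw one fresh stratified split and return both halves
f1 <- function(dat, colnm) {
  s1 <- createDataPartition(dat[[colnm]], p = 0.6, list = FALSE)
  list(train.data = dat[s1, ], test.data = dat[-s1, ])
}

n <- 100
# simplify = FALSE keeps the result as a list of n elements
out <- replicate(n, f1(df, "class"), simplify = FALSE)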
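
Note that out is a single list of 100 elements, each holding its own train.data and test.data, whereas the question asked for two separate lists. A minimal sketch of one way to reshape it is below. It is self-contained, so it rebuilds a stand-in df with base R and repeats f1; the set.seed call and the simplified df construction are assumptions added only so the example is reproducible, not part of the original answer.

```r
library(caret)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Stand-in for the question's df (300 rows: value, ID, class)
df <- data.frame(value = runif(300, min = 0.5, max = 25),
                 ID    = rep(c("A", "B", "C"), 100),
                 class = rep(c("N", "M"), 150))

# Same helper as in the answer: one fresh stratified split per call
f1 <- function(dat, colnm) {
  s1 <- createDataPartition(dat[[colnm]], p = 0.6, list = FALSE)
  list(train.data = dat[s1, ], test.data = dat[-s1, ])
}

n <- 100
out <- replicate(n, f1(df, "class"), simplify = FALSE)

# Pull the matching component out of every element of out
train.data <- lapply(out, `[[`, "train.data")
test.data  <- lapply(out, `[[`, "test.data")

# Each pair still covers every row of df exactly once
stopifnot(nrow(train.data[[1]]) + nrow(test.data[[1]]) == nrow(df))
```

Element i of train.data and element i of test.data come from the same split, so they can be paired by index. createDataPartition also accepts a times argument that returns several partitions from one call, which may be worth considering as an alternative to replicate.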