Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Vectorised argument for a function in R. The function gives out multiple data frames, whereas I'd like it to output only one

I’d like to compute trimmed mean for each trimming proportion alpha, and then see which trimming proportion gives the minimal variance of the trimmed means, when Bootstrap simulations of size N=200 are applied. The problem that I have, is that when I try to create a data frame of column1 = mean and column2 = variance, the code that I wrote creates each output of mean and variance as separate data frame, so I cannot look up the trimming proportion and trimmed mean which have the minimal variance.

The function gives out "data.frame" 9 times. I guess it’s because the alpha argument is vectorized. The code:

 tmean_var <- function(n,N,alpha){
      set.seed(1)
      data <- rnorm(n)
      data_aug1 <- c(data, -data)
      data_aug2 <- c(data, 2 * median(data) - data)
      est <- data.frame()
      tmean <- replicate(N, {
      sample <- base::mean(sample(x = data[(round(alpha*n)+1):(n-round(alpha*n))],
                                    size = n-2*round(alpha*n), replace = TRUE))
    })
      
    mean <- base::mean(tmean)
    var <- var(tmean) * (n-2*round(alpha * n))
    df <- data.frame(mean = mean, var = var)
    class(df)
    }
    f <- Vectorize(tmean_var, vectorize.args = "alpha")
    f(n,N,alpha)

How could I make the output to be one dataframe not nine?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

This should do it. Rather than try to use Vectorize() on a function that doesn’t inherently take vector arguments, you could just use sapply() and lapply() across the values of alpha you provide as below:

tmean_var <- function(n,N,alpha){
  set.seed(1)
  data <- rnorm(n)
  data_aug1 <- c(data, -data)
  data_aug2 <- c(data, 2 * median(data) - data)
  est <- data.frame()
  tmean <- lapply(alpha, function(a){replicate(N, {
    sample <- base::mean(sample(x = data[(round(a*n)+1):(n-round(a*n))],
                                size = n-2*round(a*n), replace = TRUE))
    })
  })
  
  mean <- sapply(tmean, mean)
  var <- sapply(seq_along(tmean), function(i)var(tmean[[i]]) * (n-2*round(alpha[i] * n)))
  df <- data.frame(mean = mean, var = var, alpha=alpha)
#  class(df)
}

out <- tmean_var(100, 200, c(.1, .2, .3))
out
#>         mean       var alpha
#> 1 0.10555709 0.8066377   0.1
#> 2 0.06868891 0.8331401   0.2
#> 3 0.21791984 0.9024612   0.3

Created on 2022-05-13 by the reprex package (v2.0.1)

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading