dplyr function for measuring mean and proportion of a particular value in different columns

I have a dataset, df, and I want to compute some column-wise indicators: the proportion of rows where var4 equals 1 and the mean of var3, both returned as text. I need to do this inside a function. I tried with my basic understanding of R and dplyr, but I got errors. I would appreciate it if anyone could show me a better way to write this function.

library(dplyr)

set.seed(1)
ch   <- sample(LETTERS[1:5], size = 20, replace = TRUE)
cls  <- sample(c("CLASS ONE", "CLASS TWO"), size = 20, replace = TRUE)
var1 <- sample(1:20, size = 20, replace = TRUE)
var2 <- sample(20:40, size = 20, replace = TRUE)
var3 <- sample(40:60, size = 20, replace = TRUE)
var4 <- sample(c(0,1), size = 20, replace = TRUE)

df <- data.frame(ch, cls, var1, var2, var3, var4, stringsAsFactors = TRUE)

funcA <- function(col1, col2) {
    # for measuring percentage of col1== 1 in length of col1 
    select1   =  df %>% select(col1) %>% filter(col1 == 1)
    prop1     =  (length(selected$col1)/length(df$col1))*100
    # for measuring mean of col2
    select2   = df %>% select(col2) 
    avg       = mean(select2)
    # pasting them in texts
    paste("The percentage of value 1 in", "col1", "is: ", prop1)
    paste("The mean value of", "col2", "is: ", avg)
}


funcA(var4, var3)

Solution:

Here’s an option that also uses glue:

library(dplyr)
library(glue)

set.seed(1)
ch   <- sample(LETTERS[1:5], size = 20, replace = TRUE)
cls  <- sample(c("CLASS ONE", "CLASS TWO"), size = 20, replace = TRUE)
var1 <- sample(1:20, size = 20, replace = TRUE)
var2 <- sample(20:40, size = 20, replace = TRUE)
var3 <- sample(40:60, size = 20, replace = TRUE)
var4 <- sample(c(0,1), size = 20, replace = TRUE)

df <- data.frame(ch, cls, var1, var2, var3, var4, stringsAsFactors = TRUE)

funcA <- function(col1, col2) {

  # Summarise the global `df`: the share of rows where col1 equals 1
  # (as a percentage) and the mean of col2. The {{ }} operator forwards
  # the unquoted column names into dplyr.
  stats <- df %>%
    summarise(
      prop = mean({{ col1 }} == 1) * 100,
      avg  = mean({{ col2 }})
    )

  # deparse(substitute()) recovers the column names as typed by the caller
  out <- c(
    paste0("The percentage of value 1 in ", deparse(substitute(col1)), " is: ", stats$prop),
    paste0("The mean value of ", deparse(substitute(col2)), " is: ", stats$avg)
  )

  glue::glue_collapse(out, sep = "\n\n")

}

funcA(var4, var3)
#> The percentage of value 1 in var4 is: 50
#> 
#> The mean value of var3 is: 51.1
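
The function above reads df from the global environment. If that is a concern, one possible variant passes the data frame in as an argument. This is only a sketch: the name describe_cols, its target parameter, and the small toy data frame are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Hypothetical helper (not from the original answer): takes the data frame
# as its first argument instead of relying on a global `df`.
describe_cols <- function(data, flag_col, num_col, target = 1) {
  # Percentage of rows where flag_col equals `target`, and mean of num_col.
  stats <- data %>%
    summarise(
      pct = mean({{ flag_col }} == target, na.rm = TRUE) * 100,
      avg = mean({{ num_col }}, na.rm = TRUE)
    )

  # Column names as typed by the caller, used in the messages.
  flag_name <- deparse(substitute(flag_col))
  num_name  <- deparse(substitute(num_col))

  c(
    paste0("The percentage of value ", target, " in ", flag_name, " is: ", stats$pct),
    paste0("The mean value of ", num_name, " is: ", stats$avg)
  )
}

# Small hand-made data frame so the result is easy to check by eye.
toy <- data.frame(score = c(40, 50, 60, 50), flag = c(1, 0, 1, 1))

msgs <- describe_cols(toy, flag, score)
cat(msgs, sep = "\n")
#> The percentage of value 1 in flag is: 75
#> The mean value of score is: 50
```

Returning a character vector rather than a collapsed string leaves the formatting to the caller, and the same function can then be reused on any data frame, for example describe_cols(df, var4, var3).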