I have a dataset, df, and I want to compute some column-wise indicators: the percentage of rows in which var4 equals 1 and the mean of var3, both returned as text. I need to wrap this in a function. I tried with my basic understanding of R and dplyr, but I got errors. I would appreciate it if anyone could suggest a better way to write the function.
library(dplyr)
set.seed(1)
ch <- sample(LETTERS[1:5], size = 20, replace = TRUE)
cls <- sample(c("CLASS ONE", "CLASS TWO"), size = 20, replace = TRUE)
var1 <- sample(1:20, size = 20, replace = TRUE)
var2 <- sample(20:40, size = 20, replace = TRUE)
var3 <- sample(40:60, size = 20, replace = TRUE)
var4 <- sample(c(0,1), size = 20, replace = TRUE)
df <- data.frame(ch, cls, var1, var2, var3, var4, stringsAsFactors = TRUE)
funcA <- function(col1, col2) {
# for measuring percentage of col1== 1 in length of col1
select1 = df %>% select(col1) %>% filter(col1 == 1)
prop1 = (length(selected$col1)/length(df$col1))*100
# for measuring median of col2
select2 = df %>% select(col2)
avg = mean(select2)
# pasting them in texts
paste("The percentage of value 1 in", "col1", "is: ", prop1)
paste("The mean value of", "col2", "is: ", avg)
}
funcA(var4, var3)
>Solution :
Here’s an option that also uses glue:
library(dplyr)
library(glue)
set.seed(1)
ch <- sample(LETTERS[1:5], size = 20, replace = TRUE)
cls <- sample(c("CLASS ONE", "CLASS TWO"), size = 20, replace = TRUE)
var1 <- sample(1:20, size = 20, replace = TRUE)
var2 <- sample(20:40, size = 20, replace = TRUE)
var3 <- sample(40:60, size = 20, replace = TRUE)
var4 <- sample(c(0,1), size = 20, replace = TRUE)
df <- data.frame(ch, cls, var1, var2, var3, var4, stringsAsFactors = TRUE)
funcA <- function(col1, col2) {
  # Capture the column names as text for the messages (does not evaluate them)
  name1 <- deparse(substitute(col1))
  name2 <- deparse(substitute(col2))
  # Percentage of rows where col1 equals 1, and the mean of col2
  stats <- df %>%
    summarise(
      prop = mean({{col1}} == 1) * 100,
      avg = mean({{col2}})
    )
  out <- c(
    paste0("The percentage of value 1 in ", name1, " is: ", stats$prop),
    paste0("The mean value of ", name2, " is: ", stats$avg)
  )
  # Join the two messages, separated by a blank line
  glue::glue_collapse(out, sep = "\n\n")
}
funcA(var4, var3)
#> The percentage of value 1 in var4 is: 50
#>
#> The mean value of var3 is: 51.1
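The function above relies on tidy evaluation (the {{ }} operator) so that bare column names can be passed. If passing the column names as strings is acceptable, a base R version might look like the sketch below. Note that funcB, its data argument, and the toy data frame are hypothetical names made up for illustration, and ignoring NA values with na.rm = TRUE is an assumption rather than something the question asks for.

```r
# Hypothetical variant: the data frame and column names (as strings) are arguments
funcB <- function(data, col1, col2) {
  # Share of rows where col1 equals 1, as a percentage (NAs ignored)
  prop1 <- mean(data[[col1]] == 1, na.rm = TRUE) * 100
  # Arithmetic mean of col2 (NAs ignored)
  avg <- mean(data[[col2]], na.rm = TRUE)
  # Build both messages, separated by a blank line
  paste0(
    "The percentage of value 1 in ", col1, " is: ", prop1, "\n\n",
    "The mean value of ", col2, " is: ", avg
  )
}

# Small made-up data frame, only to show the call
toy <- data.frame(a = c(1, 0, 1, 1), b = c(10, 20, 30, 40))
cat(funcB(toy, "a", "b"))
#> The percentage of value 1 in a is: 75
#>
#> The mean value of b is: 25
```

Passing the data frame explicitly avoids depending on a global df, which makes the function easier to reuse on other datasets.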