dplyr variable names in an R function

I’m trying to create a function using some dplyr functions and I think I’m running into issues with NSE (non-standard evaluation). The function below works when I pass the actual variable names as arguments, but it fails when I pass elements of the name vectors that I created.

I think I need to do something about the quoting/unquoting of the arguments, but I’m stumped:

Works:

 library(dplyr)
 
 dat1 <- read.table(text = "x1 x2 y
10 20 50
20 30.5 100
30 40.5 200
40 20.12 400
50 25 500
70 86 600
80 75 700
90 45 800", header = TRUE)
 
 num_names <- paste(colnames(dat1[sapply(dat1, is.numeric)]))
 bin_names <- paste(colnames(dat1[sapply(dat1, is.numeric)]), "bin", sep = "_")
 dat1[bin_names] <- lapply(dat1[num_names], function(x) dplyr::ntile(x, n = 10))
 
 
 make_iv <- function(df, variable, bin_variable){
   
   
   df <- df
   ivv <- df %>%
     group_by({{bin_variable}}) %>%
     summarise(N_ = n(),
               min_x = min({{variable}}),
               max_x = max({{variable}}),
               SumY = sum(y),
               perc_obs = (n()/nrow(df)),
               ans = sum(perc_obs))
   
  
   return(ivv)
 }
 
 
 make_iv(df = dat1,
         variable = x1,
         bin_variable = x1_bin)

Does not work:

 library(dplyr)
 
 dat1 <- read.table(text = "x1 x2 y
10 20 50
20 30.5 100
30 40.5 200
40 20.12 400
50 25 500
70 86 600
80 75 700
90 45 800", header = TRUE)
 
 num_names <- paste(colnames(dat1[sapply(dat1, is.numeric)]))
 bin_names <- paste(colnames(dat1[sapply(dat1, is.numeric)]), "bin", sep = "_")
 dat1[bin_names] <- lapply(dat1[num_names], function(x) dplyr::ntile(x, n = 10))
 
 
 make_iv <- function(df, variable, bin_variable){
   
   
   df <- df
   ivv <- df %>%
     group_by({{bin_variable}}) %>%
     summarise(N_ = n(),
               min_x = min({{variable}}),
               max_x = max({{variable}}),
               SumY = sum(y),
               perc_obs = (n()/nrow(df)),
               ans = sum(perc_obs))
   
  
   return(ivv)
 }
 
 
 make_iv(df = dat1,
         variable = num_names[1],
         bin_variable = bin_names[1])

Solution:

You need to distinguish whether the variable name is passed as a symbol (a bare, unquoted name) or as a string. NSE applies to symbols, i.e. you do not write quotes. Your first example passes symbols (x1, x1_bin); the second passes strings, because num_names[1] evaluates to "x1". {{ }} injects the expression as written, so a string is treated as a constant value rather than as a column name. Strings need a different syntax: instead of {{variable}}, use .data[[variable]]:

library(dplyr)

dat1 <- read.table(text = "x1 x2 y
10 20 50
20 30.5 100
30 40.5 200
40 20.12 400
50 25 500
70 86 600
80 75 700
90 45 800", header = TRUE)

num_names <- paste(colnames(dat1[sapply(dat1, is.numeric)]))
bin_names <- paste(colnames(dat1[sapply(dat1, is.numeric)]), "bin", sep = "_")
dat1[bin_names] <- lapply(dat1[num_names], function(x) dplyr::ntile(x, n = 10))


make_iv <- function(df, variable, bin_variable){
  
  
  df <- df
  ivv <- df %>%
    group_by(.data[[bin_variable]]) %>%
    summarise(N_ = n(),
              min_x = min(.data[[variable]]),
              max_x = max(.data[[variable]]),
              SumY = sum(y),
              perc_obs = (n()/nrow(df)),
              ans = sum(perc_obs))
  
  
  return(ivv)
}


make_iv(df = dat1,
        variable = num_names[1],
        bin_variable = bin_names[1])
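
Since the arguments are now plain strings, the same function can be applied to every variable/bin pair in the two name vectors, which seems to be what the vectors were built for. Below is a minimal sketch of that, assuming the data and the `.data[[ ]]` version of `make_iv` from above; the `results` list and the use of base R's `Map()` are my additions, not part of the original code.

```r
library(dplyr)

dat1 <- read.table(text = "x1 x2 y
10 20 50
20 30.5 100
30 40.5 200
40 20.12 400
50 25 500
70 86 600
80 75 700
90 45 800", header = TRUE)

# Column names as character vectors, built the same way as in the question
num_names <- names(dat1)[sapply(dat1, is.numeric)]
bin_names <- paste(num_names, "bin", sep = "_")
dat1[bin_names] <- lapply(dat1[num_names], function(x) dplyr::ntile(x, n = 10))

# String-based version: both column arguments are character scalars
make_iv <- function(df, variable, bin_variable) {
  df %>%
    group_by(.data[[bin_variable]]) %>%
    summarise(N_ = n(),
              min_x = min(.data[[variable]]),
              max_x = max(.data[[variable]]),
              SumY = sum(y),
              perc_obs = n() / nrow(df),
              ans = sum(perc_obs))
}

# Walk over the two name vectors in parallel (x1 with x1_bin, x2 with x2_bin, ...)
results <- Map(function(v, b) make_iv(dat1, v, b), num_names, bin_names)
```

`results` is a list named after `num_names`, with one summary per variable, so `results$x1` corresponds to the single call shown above.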

If you haven’t seen it yet, the “Programming with dplyr” vignette covers this in detail.
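
If the `{{ }}` version of the function has to stay as it is (for example because other code calls it with bare names), another option is to convert the string to a symbol at the call site and inject it with `!!`. The sketch below illustrates this on a small made-up data frame; `dat` and `group_max` are hypothetical names used only for this example, and `rlang::sym()` is what does the string-to-symbol conversion.

```r
library(dplyr)

# Small illustrative data set (not from the question)
dat <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 10))

# Hypothetical helper written with {{ }}, in the style of the question
group_max <- function(df, variable, group_variable) {
  df %>%
    group_by({{ group_variable }}) %>%
    summarise(max_x = max({{ variable }}))
}

# Called with bare symbols, as in the working example
by_symbol <- group_max(dat, v, g)

# Called with strings: sym() turns each string into a symbol, !! injects it
var_name <- "v"
grp_name <- "g"
by_string <- group_max(dat, !!rlang::sym(var_name), !!rlang::sym(grp_name))
```

Both calls should give the same summary. I would still prefer `.data[[ ]]` inside the function when it is always called with strings, since callers then do not need to know about `!!`.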
