Using quoted variable names when creating functions containing ellipses (i.e. '...') in dplyr

I am trying to learn how to use the ellipsis (...) when programming with dplyr, but I cannot work out how to pass a character string into it. Here is a toy problem to illustrate:

library(dplyr)

set.seed(10)
df <- data.frame(var1 = factor(sample(x = letters[1:3],
                                      size = 10,
                                      replace = TRUE)))

Now say I want simple frequencies of each level of the factor. I wrote a small function to do that:

levelFunct <- function(.data, ...) {
  .data %>%
    group_by(...) %>%
    summarise(count = n()) %>%
    mutate(tot = sum(count),
           perc = round(count / tot * 100, 2))
}

When I run the function by passing the bare variable name into the ellipsis argument, without quotation marks:

levelFunct(df, var1)

I get the following output

  var1  count   tot  perc
  <fct> <int> <int> <dbl>
1 a         1    10    10
2 b         2    10    20
3 c         7    10    70

So far so good. But if I pass the variable name into the ellipsis argument with quotation marks:

levelFunct(df, "var1")

I get the following output

  `"var1"` count   tot  perc
  <chr>    <int> <int> <dbl>
1 var1        10    10   100

So how do I replicate the result from the first run using a quoted variable name?

I tried wrapping the ellipsis inside the group_by call in double brackets ([[...]]), but that just produced an error.

Solution:

In this situation you can use the .data pronoun, which looks a column up by its name given as a string:

levelFunct(df, .data[["var1"]])
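To see why the quoted name behaves differently, here is a minimal sketch on a small hand-made data frame (the data frame and the names n_string and n_pronoun are illustrative, not from the question). A bare string inside group_by is treated as a constant value, so every row ends up in a single group, whereas the .data pronoun looks the column up by name:

```r
library(dplyr)

# Small illustrative data frame (not the one from the question)
df <- data.frame(var1 = factor(c("a", "b", "b", "c", "c", "c")))

# A bare string is evaluated as a constant, so all rows fall into one group
n_string <- df %>% group_by("var1") %>% n_groups()

# .data[["var1"]] refers to the column named "var1": one group per level
n_pronoun <- df %>% group_by(.data[["var1"]]) %>% n_groups()
```

This matches the question's second output, where the single group carries the full count and 100 percent.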

NOTE:

You can pass further grouping variables alongside it, each either as a .data[["name"]] expression or as a bare variable name.

levelFunct(df, .data[["var1"]], var2, .....)

In general, the methods described in the dplyr programming vignette apply:
https://dplyr.tidyverse.org/articles/programming.html
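If the function itself should accept column names as strings, one possible alternative (an assumption on my part, not something the answer above proposes) is to take a character vector and select the columns with across(all_of(...)), which needs dplyr 1.0.0 or later. The function name levelFunctChr and its arguments data and vars are hypothetical:

```r
library(dplyr)

# Hypothetical variant of levelFunct that takes column names as a character vector
levelFunctChr <- function(data, vars) {
  data %>%
    # all_of() selects exactly the columns named in vars (errors if one is missing)
    group_by(across(all_of(vars))) %>%
    summarise(count = n(), .groups = "drop") %>%
    mutate(tot = sum(count),
           perc = round(count / tot * 100, 2))
}

# Small illustrative data frame (not the one from the question)
df <- data.frame(var1 = factor(c("a", "b", "b", "c", "c", "c")))
res <- levelFunctChr(df, "var1")
```

With this shape, callers write levelFunctChr(df, "var1") or pass several names at once, e.g. c("var1", "var2"), at the cost of no longer accepting bare variable names.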
