I’m trying to build a function that takes two kinds of input, numeric or character, converts them or leaves them as given depending on their class, and then filters a data frame by those arguments.
library(tidyverse)
fun1 = function(df,filt_col,filt_term_1,filt_term_2){
# changing the filt_col to a symbol, which is needed to correctly parse things
filt_col = sym(filt_col)
# if statement that checks whether the filtering term is numeric or not
# if it is numeric it is left as is; if not, it is passed through deparse(substitute()) (i.e. turned into quoted text)
if (!is.numeric(filt_term_1)) {filt_term_1 = deparse(substitute(filt_term_1))}
if (!is.numeric(filt_term_2)) {filt_term_2 = deparse(substitute(filt_term_2))}
# doing one of two things depending on filtering terms that have been provided as arguments
# if numeric, then filter < and > than numbers provided
# if character, then filter == to argument provided
if(is.numeric(filt_term_1) & is.numeric(filt_term_2)) {
group1 = df %>% filter(!!filt_col < filt_term_1)
group2 = df %>% filter(!!filt_col > filt_term_2)
} else {
group1 = df %>% filter(!!filt_col == filt_term_1)
group2 = df %>% filter(!!filt_col == filt_term_2)
}
# put two groups in a list
grouped_list = list(group1,group2)
return(grouped_list)
}
# trying function which runs well with numeric args
fun1(iris,"Sepal.Length",4.9,4.9)
# but does not run with character args
fun1(iris,"Species",versicolor,virginica)
Firstly, I’m not sure what the error is about. Secondly, how can I make this more efficient? Ideally I would want to enter all arguments as non-quoted text.
Thank you.
>Solution :
The problem lies in the following three conditions, which are evaluated when unquoted expressions are passed as filt_term_1 and filt_term_2:
if (!is.numeric(filt_term_1))
if (!is.numeric(filt_term_2))
if(is.numeric(filt_term_1) & is.numeric(filt_term_2))
If filt_term_* is a numeric or character value, these expressions can be evaluated, because the value is an atomic vector. When an unquoted name such as versicolor is passed instead, is.numeric() forces R to evaluate the argument. No object called versicolor exists in the calling environment, so the call fails with an "object 'versicolor' not found" error before deparse(substitute()) is ever reached.
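The difference between evaluating an argument and capturing it can be seen in a small base R sketch. The helper names check_type and capture_name are hypothetical and exist only for illustration; the sketch assumes no object called versicolor is defined in your session.

```r
# Hypothetical helpers, for illustration only
check_type <- function(x) {
  # is.numeric() needs the value of x, so R evaluates the argument here
  is.numeric(x)
}

capture_name <- function(x) {
  # substitute() returns the unevaluated expression, so no lookup happens
  deparse(substitute(x))
}

# Evaluating an undefined name fails; tryCatch() turns the error into its message
msg <- tryCatch(check_type(versicolor), error = function(e) conditionMessage(e))
print(msg)

# Capturing the expression works even though `versicolor` does not exist
captured <- capture_name(versicolor)
print(captured)
```

In the original function the type check comes first, so the error is raised before the capturing step has a chance to run.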
We could probably think of various workarounds, but to avoid an XY problem I’d propose, in your case, to let the type of the variable in the dataset determine how the inputs should be treated, not the type of the input.
E.g.
library(tidyverse)
fun1 = function(df, filt_col, filt_term_1, filt_term_2){
# changing the filt_col to a symbol, which is needed to correctly parse things
filt_col = sym(filt_col)
# if statement that checks whether the column being filtered is numeric or not
# if the column is numeric the terms are left as is; if not, they are passed through deparse(substitute()) (i.e. turned into quoted text)
if (!is.numeric(pull(df, {{filt_col}}))) {filt_term_1 = deparse(substitute(filt_term_1))}
if (!is.numeric(pull(df, {{filt_col}}))) {filt_term_2 = deparse(substitute(filt_term_2))}
# doing one of two things depending on the type of the column being filtered
# if the column is numeric, then filter < and > than the numbers provided
# otherwise, filter == to the arguments provided
if(is.numeric(pull(df, {{filt_col}}))) {
group1 = df %>% filter(!!filt_col < filt_term_1)
group2 = df %>% filter(!!filt_col > filt_term_2)
} else {
group1 = df %>% filter(!!filt_col == filt_term_1)
group2 = df %>% filter(!!filt_col == filt_term_2)
}
# put two groups in a list
grouped_list = list(group1,group2)
return(grouped_list)
}
# the function still runs well with numeric args
fun1(iris,"Sepal.Length",4.9,4.9)
# and now also runs with unquoted character args
fun1(iris,"Species",versicolor,virginica)
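Since the question asks for all arguments to be entered unquoted, here is one possible variant as a minimal sketch. It is not part of the answer above: fun2 is a hypothetical name, and the sketch assumes the dplyr and rlang packages are installed. It uses rlang::ensym() to capture the column name, checks the column type once, and only captures the filter terms as strings when the column is not numeric.

```r
library(dplyr)

# fun2 is a hypothetical name; a sketch under the assumptions stated above
fun2 <- function(df, filt_col, filt_term_1, filt_term_2) {
  # capture the column name as a symbol (accepts Species or "Species")
  filt_col <- rlang::ensym(filt_col)
  # inspect the column type once and reuse the result
  col_is_numeric <- is.numeric(pull(df, !!filt_col))

  if (col_is_numeric) {
    # numeric column: the terms are ordinary values and are evaluated normally
    group1 <- df %>% filter(!!filt_col < filt_term_1)
    group2 <- df %>% filter(!!filt_col > filt_term_2)
  } else {
    # non-numeric column: capture the unquoted terms as strings without evaluating them
    term_1 <- rlang::as_string(rlang::ensym(filt_term_1))
    term_2 <- rlang::as_string(rlang::ensym(filt_term_2))
    group1 <- df %>% filter(!!filt_col == term_1)
    group2 <- df %>% filter(!!filt_col == term_2)
  }
  # return the two groups as a list
  list(group1, group2)
}

res_num <- fun2(iris, Sepal.Length, 4.9, 4.9)
res_chr <- fun2(iris, Species, versicolor, virginica)
```

One caveat of this design: if the data frame happened to contain a column named filt_term_1 or term_1, filter() would pick up that column instead of the argument, so in real code you may prefer to unquote the values with !! as well.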