I’m confused about this inconsistency in the tidyverse and am not sure what’s going on.
Test data:
test <- data.frame(test_gibberish = 1,
                   test_prob_gibberish = 2)
I now want to check whether there is a column that ends with "_gibberish" but is not preceded by "_prob".
This one works and returns the correct result:
stringr::str_detect(names(test), "(?<!_prob)_gibberish$")
[1] TRUE FALSE
However, this one returns an error, despite using exactly the same regex:
test |>
  dplyr::select(tidyselect::matches("(?<!_prob)_gibberish$"))
Error in `dplyr::select()`:
! invalid regular expression '(?<!_prob)_gibberish$', reason 'Invalid regexp'
Run `rlang::last_error()` to see where the error occurred.
Warning message:
In grep(needle, haystack, ...) :
TRE pattern compilation error 'Invalid regexp'
Is my regex wrong? Is stringr wrong? Is tidyselect wrong?
>Solution :
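By default, perl = FALSE in matches() according to ?tidyselect::matches, so the pattern is compiled by R's default TRE engine (as the warning from grep() shows), which does not support lookbehind: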
matches(match, ignore.case = TRUE, perl = FALSE, vars = NULL)
test |>
  dplyr::select(tidyselect::matches("(?<!_prob)_gibberish$", perl = TRUE))
-output
  test_gibberish
1              1
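With perl = TRUE the pattern is compiled by the PCRE engine, where lookaround expressions are valid. Your regex is fine: stringr accepts it because it uses the ICU regex engine, which also supports lookarounds.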
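
To see the engine difference without dplyr, here is a minimal base R sketch using grepl() on the same column names. The variable names (cols, pattern, tre_result, pcre_result) are just for illustration, and the tryCatch() wrapper is only there so the failing call does not stop the script.

```r
# Column names from the question
cols <- c("test_gibberish", "test_prob_gibberish")
pattern <- "(?<!_prob)_gibberish$"

# Default engine (perl = FALSE, i.e. TRE): the lookbehind does not compile.
# tryCatch() converts the resulting error into NA so the script keeps running.
tre_result <- tryCatch(
  suppressWarnings(grepl(pattern, cols)),
  error = function(e) NA
)

# PCRE engine (perl = TRUE): the lookbehind is supported.
pcre_result <- grepl(pattern, cols, perl = TRUE)

print(tre_result)
print(pcre_result)
```

Here tre_result should end up as NA because the pattern fails to compile, while pcre_result flags only the first name, matching what stringr::str_detect() returned in the question.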