How do I select variables that contain a certain string in the values

I need to select the character variables in a data frame, but exclude any whose values contain correctly accented French text (for example, a "ç").

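library(dplyr)
library(stringr)

# var1 holds a mis-encoded value; the backslash is escaped so the string parses
var1 <- rep(c("fran\\0xC3cais", "english"), 100)
var2 <- rnorm(200)
var3 <- rep(c("français", "english"), 100)
df <- data.frame(var1 = var1, var2 = var2, var3 = var3)

# My attempt, which does not work:
df %>%
  select(!where(str_detect(., "ç")))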

>Solution :

library(stringr)
library(dplyr)

df |> 
  select(where(~ !any(str_detect(., fixed("ç")))))

where() expects a function that returns a single logical value for each column. ~ creates an anonymous function in the tidyverse; you could also have used \(x) or function(x).
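As a rough illustration of the three spellings mentioned above, the sketch below builds a small data frame modelled on the question (the values and the result names are made up for the example) and passes each form of the predicate to where(). It assumes R 4.1 or later for |> and \(x).

```r
library(dplyr)
library(stringr)

# Toy data modelled on the question; values are illustrative only
df <- data.frame(
  var1 = rep(c("francais", "english"), 3),
  var2 = rnorm(6),
  var3 = rep(c("français", "english"), 3)
)

# Three equivalent predicates: each receives one column and returns one TRUE/FALSE
via_formula  <- df |> select(where(~ !any(str_detect(.x, fixed("ç")))))
via_lambda   <- df |> select(where(\(x) !any(str_detect(x, fixed("ç")))))
via_function <- df |> select(where(function(x) !any(str_detect(x, fixed("ç")))))

names(via_formula)
# "var1" "var2"
```

All three calls should select the same columns; which spelling to use is mostly a matter of taste.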

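str_detect() returns one logical value per element, so any() is used to collapse that vector into a single value: TRUE if any value in the column matches your pattern. The ! then negates it, so only columns without a match are kept.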
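To see why any() is needed, it may help to look at the intermediate values for a single column. The vector x below is a made-up stand-in for one column of the data frame.

```r
library(stringr)

# Hypothetical stand-in for one column
x <- c("français", "english", "français")

hits <- str_detect(x, fixed("ç"))  # one logical per element
hits
# TRUE FALSE  TRUE

has_match <- any(hits)   # collapsed to a single value: TRUE
keep_col  <- !has_match  # FALSE, so where() would drop this column
```

If a column can contain missing values, any(hits, na.rm = TRUE) is a safer choice, since any() can otherwise return NA rather than TRUE or FALSE.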

You can also add logic to keep only character columns, which is what the question asks for, like so: ~ !any(str_detect(., fixed("ç"))) & is.character(.)
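A minimal sketch of that combined condition, written as a named helper for readability. The helper name is_unaccented_chr and the toy data are assumptions for this example, not part of the original answer; it also checks is.character() first with && so that non-character columns are never passed to str_detect().

```r
library(dplyr)
library(stringr)

# Toy data modelled on the question; values are illustrative only
df <- data.frame(
  var1 = rep(c("francais", "english"), 3),
  var2 = rnorm(6),
  var3 = rep(c("français", "english"), 3)
)

# Hypothetical helper: TRUE only for character columns with no "ç" in any value
is_unaccented_chr <- function(x) {
  is.character(x) && !any(str_detect(x, fixed("ç")), na.rm = TRUE)
}

result <- df |> select(where(is_unaccented_chr))
names(result)
# "var1"
```

Here var2 is dropped because it is numeric and var3 because it contains "ç", leaving only var1.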
