dplyr filter function not working to filter my dataframe in R

March 29, 2023

I have a dataframe in R with two columns. The datatype/class of the first column is "character". However there are numerics embedded within it … but I presumed these are still technically characters since when I run the function class(column_name) it returns "character".

I am trying to filter the dataframe using the dplyr filter function. I want the filter function to return the same dataframe, but without the rows where the column ‘doc_id’ contains "(2).txt" at the end.

I have been trying many things but none have worked.

I have tried:

constitutions <- constitutions %>% filter(!str_detect(doc_id, "(2).txt"))

constitutions <- constitutions[constitutions$doc_id %in% "(2).txt" == FALSE]

constitutions %>% filter(!str_detect(doc_id, "(2).txt"))

*Note: This one ^ seems to have gotten rid of only a few of them, but not close to all.

constitutions <- subset(constitutions, !"(2).txt" %in% doc_id)

constitutions <- subset(constitutions, !("(2).txt" %in% constitutions$doc_id))

And MANY more iterations … what am I missing?

P.S. An example of a doc_id column value I am trying to remove from the constitutions dataframe is:

Brazil_1988_rev_2017 (2).txt

Would using a regex within one of the functions above work? I am lost, and running out of ideas.
Any help would be much appreciated.

>Solution :

Does escaping the parentheses and the period like this solve the problem?

constitutions <- constitutions %>% filter(!str_detect(doc_id, "\\(2\\)\\.txt"))

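Parentheses and periods (along with a number of other symbols) are special characters in regular expressions. Unescaped, (2) is a group that matches just "2", and . matches any single character, so the pattern "(2).txt" matches a "2", then any one character, then "txt". That fits a name ending in "2.txt" but not one ending in "(2).txt", which is probably why only a few rows were removed. To match a literal parenthesis or period, escape it with a backslash, which is written as a double backslash inside an R string. For example: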

This works:

> "document(2).txt" %>% str_detect("\\(2\\)\\.txt")
[1] TRUE

This doesn’t:

> "document(2).txt" %>% str_detect("(2).txt")
[1] FALSE
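
To see why the unescaped pattern removed only a few rows, it may help to run both patterns over a handful of names. This is a small sketch; apart from the Brazil example from the question, the file names are made up for illustration.

```r
library(stringr)

# Hypothetical doc_id values; only the first comes from the question
ids <- c("Brazil_1988_rev_2017 (2).txt",
         "Chile_1980_rev_2012.txt",
         "France_1958_rev_2008.txt")

# Unescaped: "(2)" is a group matching "2", "." matches any one character,
# so this looks for "2", then any character, then "txt"
unescaped <- str_detect(ids, "(2).txt")

# Escaped: looks for the literal text "(2).txt"
escaped <- str_detect(ids, "\\(2\\)\\.txt")

print(unescaped)  # FALSE TRUE FALSE
print(escaped)    # TRUE FALSE FALSE
```

The unescaped pattern flags the name ending in "2.txt" and misses the one that actually contains "(2).txt".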

Here’s a link to more about regular expressions. The whole chapter is useful, but here’s the section about escaping: https://r4ds.hadley.nz/regexps.html#sec-regexp-escaping
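
Since the question asks about names with "(2).txt" at the end, a few variations might be worth trying as well. The sketch below uses a tiny made-up data frame as a stand-in for constitutions (the text column and its values are assumptions), and shows an anchored regex, a literal match with fixed(), and a base R equivalent.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the real data frame
constitutions <- data.frame(
  doc_id = c("Brazil_1988_rev_2017.txt", "Brazil_1988_rev_2017 (2).txt"),
  text = c("original", "duplicate"),
  stringsAsFactors = FALSE
)

# Escaped regex, anchored with "$" so it only matches at the end of the name
by_regex <- constitutions %>%
  filter(!str_detect(doc_id, "\\(2\\)\\.txt$"))

# fixed() treats the pattern as literal text, so no escaping is needed;
# str_ends() only looks at the end of each string
by_fixed <- constitutions %>%
  filter(!str_ends(doc_id, fixed("(2).txt")))

# Base R equivalent without stringr
by_base <- constitutions[!endsWith(constitutions$doc_id, "(2).txt"), ]

print(by_regex$doc_id)
```

All three should keep only the row without the "(2)" suffix. As for the %in% attempts in the question, %in% compares whole values for equality rather than searching inside them, so it never matches part of a file name.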