I am processing strings in R that are supposed to contain zero or one pair of parentheses. If a string contains nested parentheses, I need to delete the inner pair. In the example below, I need to delete the parentheses around "big bent nachos" but keep the outer parentheses.
test <- c(
"Record ID",
"What is the best food? (choice=Nachos)",
"What is the best food? (choice=Tacos (big bent nachos))",
"What is the best food? (choice=Chips with stuff)",
"Complete?"
)
I know I can kill all the parentheses with the stringr package using str_remove_all():
test |>
stringr::str_remove_all(stringr::fixed(")")) |>
stringr::str_remove_all(stringr::fixed("("))
but I don’t have the RegEx skills to pick out only the inner parentheses. I found an SO post that is close, but it removes the outer parentheses, and I can't untangle it to remove the inner ones instead.
>Solution :
Here is a solution using gsub from base R. It is broken down into 2 steps for readability and debugging.
test <- c(
"Record ID",
"What is the best food? (choice=Nachos)",
"What is the best food? (choice=Tacos (big bent nachos))",
"What is the best food? (choice=Chips with stuff)",
"Complete?"
)
test <- gsub("(\\(.*)\\(", "\\1", test)
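# (\\(.*) - group 1: the first '(' and everything after it, up to the last '(' (because .* is greedy)
# \\(     - the last '(' in the string, i.e. the inner one; it is matched but left outside the group
# "\\1"   - replace the whole match with group 1, which drops that inner '('
test <- gsub("\\)(.*\\))", "\\1", test)
# Mirror image of the first step: match the first ')' outside the group,
# capture everything through the last ')', and keep only the group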
test
[1] "Record ID"
[2] "What is the best food? (choice=Nachos)"
[3] "What is the best food? (choice=Tacos big bent nachos)"
[4] "What is the best food? (choice=Chips with stuff)"
[5] "Complete?"
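As a possible alternative, the two steps can be collapsed into a single gsub call by matching the outer and inner pairs explicitly. The following is only a sketch: the helper name flatten_inner_parens is made up for illustration, and it assumes each string contains at most one nested pair, as described in the question. Strings with several inner pairs, or deeper nesting, are left unchanged by this pattern.

```r
# Hypothetical helper (not part of base R or stringr): drops one inner pair
# of parentheses while keeping the outer pair.
flatten_inner_parens <- function(x) {
  # \\(  ([^()]*)  \\(  ([^()]*)  \\)  ([^()]*)  \\)
  # outer '(' , text , inner '(' , text , inner ')' , text , outer ')'
  # [^()]* matches any run of characters that are not parentheses.
  gsub("\\(([^()]*)\\(([^()]*)\\)([^()]*)\\)", "(\\1\\2\\3)", x)
}

test <- c(
  "Record ID",
  "What is the best food? (choice=Nachos)",
  "What is the best food? (choice=Tacos (big bent nachos))",
  "What is the best food? (choice=Chips with stuff)",
  "Complete?"
)

result <- flatten_inner_parens(test)
result[3]
# [1] "What is the best food? (choice=Tacos big bent nachos)"
```

Because the pattern only matches when an inner pair is present, strings with zero or one pair of parentheses pass through untouched.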