I have a string which always contains unwanted text at the end. I would like to extract everything but the unwanted text.
library(stringr)
text <- "my_text_and_unwanted_text"
output <- str_extract(text, ".*(?=<_and)")
output
I am hoping that ".*" matches all the text that precedes "_and". So the intended result is "my_text", but I get NA. I have reviewed a number of posts but am having trouble finding examples that show how to match everything except a given string.
>Solution :
Another way to think of this operation is to replace the unwanted text with nothing rather than extract everything else. This is often simpler.
library(stringr)
text <- "my_text_and_unwanted_text"
str_replace(text, "_and.*", "")
# [1] "my_text"
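If it helps, here is a small sketch of two equivalent spellings of the same replacement: stringr's `str_remove()`, which is shorthand for replacing with an empty string, and base R's `sub()`, which needs no package. The variable names `via_remove` and `via_sub` are just for illustration.

```r
library(stringr)

text <- "my_text_and_unwanted_text"

# str_remove(x, pattern) behaves like str_replace(x, pattern, "")
via_remove <- str_remove(text, "_and.*")

# Base R: sub() replaces the first match of the pattern
via_sub <- sub("_and.*", "", text)

print(via_remove)
print(via_sub)
```

Both should give the same string as the `str_replace()` call above.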
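For the extracting approach, your attempt was very close. (?<=...) is a look-behind; you need (?=...) for a look-ahead. As written, (?=<_and) is a look-ahead for a literal "<" followed by "_and", which never occurs in the string, hence the NA.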
str_extract(text, ".*(?=_and)")
# [1] "my_text"
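One thing to watch for: the two approaches only agree when "_and" appears once. The greedy ".*" in the look-ahead version runs up to the last "_and", while "_and.*" in the replace version starts at the first one. Below is a rough sketch of the difference, using a made-up input (`text2` is hypothetical and not from the question) and a lazy ".*?" to make the extract stop at the first "_and".

```r
library(stringr)

# Hypothetical input with two occurrences of "_and"
text2 <- "salt_and_pepper_and_unwanted"

# Greedy: .* backtracks only as far as the LAST "_and"
greedy <- str_extract(text2, ".*(?=_and)")

# Lazy: .*? stops at the FIRST "_and"
lazy <- str_extract(text2, ".*?(?=_and)")

# Replace approach: removes from the FIRST "_and" to the end
replaced <- str_replace(text2, "_and.*", "")

print(greedy)    # "salt_and_pepper"
print(lazy)      # "salt"
print(replaced)  # "salt"
```

Which one you want depends on whether the unwanted part starts at the first or the last "_and" in your real data.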