
Regex to edit text depending on the number of occurrences of a keyword

I’m grappling with a regex solution to the following problem: say, I have a series of strings that all contain a number of occurrences of the keyword Appendix or appendix like this:

text <- c("Appendix abc Appendix def appendix final",
          "blah blah Appendix abc Appendix finalissimo")

and I want to delete everything that follows the last occurrence of "Appendix", including the keyword itself, to obtain the following desired output:

1 Appendix abc Appendix def
2 blah blah Appendix abc 

I know that tidyverse solutions are possible (e.g., Extract all text before the last occurrence of a specific word), but here I’m specifically interested in a regex solution. I’ve tried a number of regex solutions, but none seem to work. The one I thought most promising uses a negative lookahead and a backreference, but it too does not produce the desired result:

library(stringr)
str_extract(text, "(?i).*(?!(appendix).*\\1)")

I’d be grateful for advice on why this solution does not work and for a regex solution that does work.

Solution:

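I would use a regex with a negative lookahead here. The pattern matches whitespace followed by "appendix" only when no further "appendix" occurs later in the string, then consumes everything up to the end, so sub() deletes the last occurrence and all that follows it: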

text <- c("Appendix abc Appendix def appendix final",
          "blah blah Appendix abc Appendix finalissimo")
output <- sub("(?i)\\s+appendix(?!.*\\bappendix\\b).*", "", text, perl=TRUE)
output

[1] "Appendix abc Appendix def" "blah blah Appendix abc"
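
The question also asks why the original attempt fails. The following is a minimal sketch of what appears to happen, run through base R's PCRE engine (perl = TRUE) rather than stringr, followed by an alternative that relies on greedy backtracking instead of a lookahead. The names attempt and alt are illustrative, and the alternative pattern is a swapped-in technique, not the method used in the answer above.

```r
text <- c("Appendix abc Appendix def appendix final",
          "blah blah Appendix abc Appendix finalissimo")

# The pattern from the question. The greedy .* first consumes the whole
# string. At the end of the string the negative lookahead cannot find
# "appendix", so it succeeds immediately and the engine never backtracks.
attempt <- regmatches(text,
                      regexpr("(?i).*(?!(appendix).*\\1)", text, perl = TRUE))
all(attempt == text)  # the match is the entire input, nothing is trimmed

# Alternative without a lookahead (an assumption, not the answer's pattern):
# the greedy (.*) backtracks only as far as needed, so the "appendix" that
# follows it is necessarily the last one in the string.
alt <- sub("(?i)^(.*)\\s+appendix\\b.*$", "\\1", text, perl = TRUE)
alt
```

Both this alternative and the lookahead pattern require whitespace before the keyword, so a string whose only occurrence of "Appendix" sits at the very start is returned unchanged.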