Remove special word using gsub

I am trying to clean some text and would like to remove the following text from a string:

googletag.cmd.push(function() {
googletag.display('div-gpt-ad-1513202928332-3'); });

For example, if

x="123 googletag.cmd.push(function() { googletag.display('div-gpt-ad-1513202928332-3'); }); 456"

then

gsub("googletag.cmd.push(function() { googletag.display('div-gpt-ad-1513202928332-3'); });", "", x)

The desired output is [1] 123456

Thank you

>Solution :

Regex approach

You can use the following pattern.

x <- "123 googletag.cmd.push(function() { googletag.display('div-gpt-ad-1513202928332-3'); }); 456"

gsub("^(\\d+).*?(\\d+)$", "\\1\\2", x)
# [1] "123456"

Explanation:

We keep the groups of digits at the start and end (groups 1 and 2) and discard everything in between. The middle part of the pattern is non-greedy so that group 2 captures all of the trailing digits; a greedy match in the middle would leave only the last digit for group 2.
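
To see why the non-greedy quantifier matters, here is a small comparison of the two variants on the same input. This is a sketch rather than part of the original answer: it passes perl = TRUE (an assumption, the answer above uses the default engine) so that the matching behaviour is the well-known PCRE one, and the variable names greedy and lazy are only for illustration.

```r
x <- "123 googletag.cmd.push(function() { googletag.display('div-gpt-ad-1513202928332-3'); }); 456"

# Greedy middle: .* grabs as much as it can, so group 2 is left with one digit
greedy <- gsub("^(\\d+).*(\\d+)$", "\\1\\2", x, perl = TRUE)
greedy
# [1] "1236"

# Non-greedy middle: .*? grabs as little as it can, so group 2 gets all of 456
lazy <- gsub("^(\\d+).*?(\\d+)$", "\\1\\2", x, perl = TRUE)
lazy
# [1] "123456"
```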

Non-regex approach

It’s a little difficult to tell with one example, but if it’s always the number at the beginning and the end of the string, you don’t need regex. You can just split on spaces and take the first and last element:

strsplit(x, " ", fixed = TRUE) |>
    sapply(\(m) paste0(head(m, 1), tail(m, 1)))
# [1] "123456"
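
If the unwanted text is always exactly the snippet from the question, another option is to remove it as a literal string. The sketch below assumes the snippet is surrounded by a single space on each side, as in the example; the names ad and cleaned are hypothetical. With fixed = TRUE, gsub treats the pattern as plain text, so the parentheses, braces, and dots do not need escaping.

```r
x <- "123 googletag.cmd.push(function() { googletag.display('div-gpt-ad-1513202928332-3'); }); 456"

# Literal text to remove, including the space before and after it (assumed)
ad <- " googletag.cmd.push(function() { googletag.display('div-gpt-ad-1513202928332-3'); }); "

# fixed = TRUE matches the pattern character for character, not as a regex
cleaned <- gsub(ad, "", x, fixed = TRUE)
cleaned
# [1] "123456"
```

Unlike the two approaches above, this leaves any other text in the string untouched, so it only fits when the snippet never varies.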