How to mask a string based on a pattern string of the same length

I have the following pair of strings:

core_string     <- "AFFVQTCRE"
mask_string     <- "*KKKKKKKK"

What I want to do is mask core_string with mask_string.
Wherever mask_string has a *, keep the character at that position in core_string;
everywhere else, use the character from mask_string.

So the desired result is:

   AKKKKKKKK

Another example:

core_string     <- "AFFVQTCRE"
mask_string     <- "*KKKK*KKK"
 #     result       AKKKKTKKK

The length of both strings is always the same.
How can I do that with R?

>Solution :

Here’s a helper function that does just that:

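apply_mask <- function(x, mask) {
  # Split every string into single characters, then process x and mask pairwise
  unlist(Map(function(z, m) {
    # Where the mask has "*", take the character from the core string
    m[m == "*"] <- z[m == "*"]
    paste(m, collapse = "")
  }, strsplit(x, ""), strsplit(mask, "")))
}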

Basically, you split both strings into single characters, overwrite each "*" in the mask with the character at the same position in the core string, and then paste the characters back together.
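
To make those steps concrete, here is a minimal walk-through for a single pair of strings, using the second example from the question. The names x_chars, m_chars, and keep are only illustrative and do not appear in the helper above.

```r
core_string <- "AFFVQTCRE"
mask_string <- "*KKKK*KKK"

# Split each string into a vector of single characters
x_chars <- strsplit(core_string, "")[[1]]
m_chars <- strsplit(mask_string, "")[[1]]

# Logical vector marking the positions where the mask has a "*"
keep <- m_chars == "*"

# At those positions, copy the character from the core string into the mask
m_chars[keep] <- x_chars[keep]

# Glue the characters back into one string
result <- paste(m_chars, collapse = "")
result
# [1] "AKKKKTKKK"
```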

I used Map so that the function stays vectorized over both inputs. For example:

core_string     <- c("AFFVQTCRE", "ABCDEFGHI")
mask_string     <- "*KKKK*KKK"

apply_mask(core_string, mask_string)
# [1] "AKKKKTKKK" "AKKKKFKKK"
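
The helper relies on each string having the same number of characters as its mask, which the question guarantees. If that might not hold in your data, one option is a small wrapper that checks first. This is only a sketch: apply_mask_checked is a hypothetical name, and it assumes mask is either a single string or has one element per element of x.

```r
apply_mask <- function(x, mask) {
  unlist(Map(function(z, m) {
    m[m == "*"] <- z[m == "*"]
    paste(m, collapse = "")
  }, strsplit(x, ""), strsplit(mask, "")))
}

# Hypothetical wrapper: stop early if any string and its mask differ in length
apply_mask_checked <- function(x, mask) {
  stopifnot(
    length(mask) %in% c(1L, length(x)),  # one mask for all, or one mask per string
    all(nchar(x) == nchar(mask))         # nchar() recycles a length-one mask
  )
  apply_mask(x, mask)
}

apply_mask_checked(c("AFFVQTCRE", "ABCDEFGHI"), "*KKKK*KKK")
# [1] "AKKKKTKKK" "AKKKKFKKK"
```

With the check in place, a call such as apply_mask_checked("ABC", "*K") stops with an error instead of returning a misleading result.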