Looking to parse and copy data for a specific pattern in R

I am trying to parse the following strings. I want to copy the accession number at the beginning of each string and insert it after the "];" and before the "dxPhospho". Note that the number before the "x" can be any number, which is why I write it as "d" here. The pattern "]; dxPhospho" is what I need to match.

# Sample dataframe
DT <- data.frame(Positions.in.Master.Proteins = c("Q8R149 2xPhospho [T131(100); T/S]; 2xPhospho [T157(100); T/S]",
                                                  "Q9UET0 3xPhospho [S23(90); T63(70); Y67(70)]; 3xPhospho]"))

The desired output looks like this:

[1] "Q8R149 2xPhospho [T131(100); T/S]; **Q8R149** 2xPhospho [T157(100); T/S]"

[2] "Q9UET0 3xPhospho [S23(90); T63(70); Y67(70)]; **Q9UET0** 3xPhospho]"

The accession numbers (marked with ** here for emphasis only) are copied to where I need them to be. Thanks!

>Solution :

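With the package gsubfn, you can compute the replacement from the string itself. sub(" \\[.*", "", i) keeps everything before the first " [", that is, the accession number plus the first modification label (e.g. "Q8R149 2xPhospho"), and that text replaces each later "; dxPhospho" match. This relies on every repeated label in a string being the same as the first one, as in your sample. The pattern uses \\d+ so that the number before the "x" can have more than one digit.

library(gsubfn)

unname(
  sapply(DT$Positions.in.Master.Proteins, 
         # for each string i, replace every "; <digits>xPhospho" match
         \(i) gsubfn(pattern = "; \\d+xPhospho", 
                     # replacement: "; " + accession and first label of i
                     replacement = \(x) paste0("; ", sub(" \\[.*", "", i)), 
                     x = i))
  )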

[1] "Q8R149 2xPhospho [T131(100); T/S]; Q8R149 2xPhospho [T157(100); T/S]"
[2] "Q9UET0 3xPhospho [S23(90); T63(70); Y67(70)]; Q9UET0 3xPhospho]"  
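
If you would rather avoid the extra dependency, the same idea can be sketched in base R. This is only a sketch under two assumptions: the accession number is everything before the first space, and it contains no backslashes (it is pasted into the replacement string). The helper name add_accession is made up for illustration.

```r
# Sample data from the question
DT <- data.frame(Positions.in.Master.Proteins = c(
  "Q8R149 2xPhospho [T131(100); T/S]; 2xPhospho [T157(100); T/S]",
  "Q9UET0 3xPhospho [S23(90); T63(70); Y67(70)]; 3xPhospho]"
))

# Hypothetical helper: insert each string's accession after every "]; "
# that is followed by "<digits>xPhospho"
add_accession <- function(x) {
  x <- as.character(x)
  # Assumption: the accession is the text before the first space
  acc <- sub(" .*", "", x)
  # gsub() is not vectorised over the replacement, so pair each string
  # with its own accession via mapply()
  mapply(
    function(s, a) gsub("\\]; ([0-9]+xPhospho)", paste0("]; ", a, " \\1"), s),
    x, acc,
    USE.NAMES = FALSE
  )
}

result <- add_accession(DT$Positions.in.Master.Proteins)
print(result)
```

Because the pattern requires the closing "]" before the "; ", separators inside the brackets (such as "; T/S") are left alone, and the captured group puts back whatever label was matched instead of reusing the first one.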