Extract a string between two letters and create a new column in R dplyr

I have a data frame (a tibble) that looks like this:

``` r
library(tidyverse)
df1 <- tibble(genes = c("AT1G02205", "AT1G02160", "AT5G02160", "ATCG02160"))
df1
#> # A tibble: 4 × 1
#>   genes    
#>   <chr>    
#> 1 AT1G02205
#> 2 AT1G02160
#> 3 AT5G02160
#> 4 ATCG02160
```

Created on 2022-10-19 with reprex v2.0.2

I want to extract whatever sits between the letters T and G, prefix it with "Chr", and store the result in a new column, so that my new.df looks like this:

``` r
#>   genes     chr  
#>   <chr>     <chr>
#> 1 AT1G02205 Chr1 
#> 2 AT1G02160 Chr1 
#> 3 AT5G02160 Chr5 
#> 4 ATCG02160 ChrC
```

So far, I have found a nasty way to do this, but I am sure I could have done better.

``` r
library(tidyverse)
df1 <- tibble(genes=c("AT1G02205","AT1G02160","AT5G02160", "ATCG02160"))

new.df <-  df1 |> 
  mutate(chr=str_extract(genes, "T(.*?)G"))  |> 
  mutate(chr=str_replace_all(chr, c("T"="", "G"=""))) |> 
  mutate(chr=paste0("Chr",chr))
new.df
#> # A tibble: 4 × 2
#>   genes     chr  
#>   <chr>     <chr>
#> 1 AT1G02205 Chr1 
#> 2 AT1G02160 Chr1 
#> 3 AT5G02160 Chr5 
#> 4 ATCG02160 ChrC
```

Created on 2022-10-19 with reprex v2.0.2

Solution:

You can use str_match, which returns each capture group as its own matrix column:

``` r
library(stringr)
library(dplyr)

# Column 2 of the str_match() result holds the text captured between T and G
df1 %>% 
  mutate(chr = str_c("Chr", str_match(genes, "T(.*)G")[, 2]))

#   genes     chr  
# 1 AT1G02205 Chr1 
# 2 AT1G02160 Chr1 
# 3 AT5G02160 Chr5 
# 4 ATCG02160 ChrC
```
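
To see why the `[, 2]` index is needed, it may help to look at what str_match returns on its own. This is only a minimal sketch: the standalone vector `genes` and the intermediate names `m` and `chr` are introduced here for illustration, and the pattern uses the lazy `.*?` from the question, which stops at the first G rather than the last one.

``` r
library(stringr)

# Illustrative copy of the genes column from the question
genes <- c("AT1G02205", "AT1G02160", "AT5G02160", "ATCG02160")

# str_match() returns a character matrix:
# column 1 is the full match (e.g. "T1G"), column 2 is the first capture group
m <- str_match(genes, "T(.*?)G")
m

# Keep only the captured text and add the "Chr" prefix
chr <- str_c("Chr", m[, 2])
chr
```

For these IDs the greedy `.*` and the lazy `.*?` give the same result, because each ID contains a single G. If an ID could contain more than one G, the lazy form is probably the safer choice.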

Or in base R with gsub:

``` r
# "\\1" replaces the whole string with just the text captured between T and G
df1 |>
  transform(chr = paste0("Chr", gsub(".*T(.*)G.*", "\\1", genes)))
```
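
A further option, not part of the answer above, is to let lookarounds do the trimming so that str_extract returns only the text between T and G. This is a sketch under the assumption that every ID has a T followed later by a G; the pattern `(?<=T).*?(?=G)` is my own variation on the regex from the question.

``` r
library(dplyr)
library(stringr)

df1 <- tibble(genes = c("AT1G02205", "AT1G02160", "AT5G02160", "ATCG02160"))

# (?<=T) requires a T just before the match, (?=G) requires a G just after it,
# and neither letter is included in the extracted text
new.df <- df1 |>
  mutate(chr = str_c("Chr", str_extract(genes, "(?<=T).*?(?=G)")))
new.df
```

This avoids both the matrix indexing and the separate str_replace_all step from the question. IDs that do not match yield NA instead of an error.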