How to enable regmatches to work in dplyr's mutate

I have the following function, which replaces each run of ? characters in seed_pattern with characters taken from the replacement string bb_seq.

library(tidyverse)
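replace_bb_with_str <- function(seed_pattern = NULL, bb_seq = NULL) {
  sp <- seed_pattern
  # Locate every run of consecutive "?" characters in the pattern
  gr <- gregexpr("\\?+", sp)
  # Cumulative run lengths give the end position of each slice of bb_seq
  csml <- lapply(gr, function(m) cumsum(attr(m, "match.length")))
  # Replace each run with a slice of bb_seq ending at that cumulative length
  regmatches(sp, gr) <- lapply(csml, function(e) substring(bb_seq, c(1, e[1]), e))
  sp
}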

It works well on a single input:

plist <- c(
  "??????????DRHRTRHLAK??????????",
  "????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????"
)

replace_bb_with_str(seed_pattern = plist[1], bb_seq =  "ndqeegillkkkkfpssyvv")
# [1] "ndqeegillkDRHRTRHLAKkkkkfpssyvv"

But when I run it inside dplyr::mutate():

expand.grid(seed_pattern = plist, bb_seq =  "ndqeegillkkkkfpssyvv") %>%
  rowwise() %>%
  mutate(nseq = replace_bb_with_str(seed_pattern = seed_pattern, bb_seq = bb_seq)) 

I got this error:

Error in `mutate()`:
! Problem while computing `nseq = replace_bb_with_str(seed_pattern =
  seed_pattern, bb_seq = bb_seq)`.
ℹ The error occurred in row 1.
Caused by error in `nchar()`:
! 'nchar()' requires a character vector

How can I resolve this issue?

>Solution:

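expand.grid() converts character vectors to factors by default (stringsAsFactors = TRUE), and factors don't play nicely with your function: the nchar() call named in the error message rejects anything that is not a character vector. tidyr::expand_grid() preserves input types, so your function works fine: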

library(tidyr)

expand_grid(seed_pattern = plist, bb_seq =  "ndqeegillkkkkfpssyvv") %>% 
  rowwise() %>%
  mutate(nseq = replace_bb_with_str(seed_pattern = seed_pattern, bb_seq = bb_seq)) 
# A tibble: 4 × 3
# Rowwise: 
  seed_pattern                             bb_seq               nseq            
  <chr>                                    <chr>                <chr>           
1 ??????????DRHRTRHLAK??????????           ndqeegillkkkkfpssyvv ndqeegillkDRHRT…
2 ????????????????????TRCYHIDPHH           ndqeegillkkkkfpssyvv ndqeegillkkkkfp…
3 FKDHKHIDVK????????????????????TRCYHIDPHH ndqeegillkkkkfpssyvv FKDHKHIDVKndqee…
4 FKDHKHIDVK????????????????????           ndqeegillkkkkfpssyvv FKDHKHIDVKndqee…
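
To see the coercion for yourself, a quick base-R check along these lines may help. This is only a sketch: the shortened plist and the names g_default, g_chr and msg are introduced here for illustration. It also relies on expand.grid() accepting stringsAsFactors = FALSE, which should be an alternative if you would rather not switch to tidyr.

```r
# Shortened example data (illustrative subset of the question's plist)
plist <- c("??????????DRHRTRHLAK??????????", "FKDHKHIDVK????????????????????")
bb <- "ndqeegillkkkkfpssyvv"

# Default behaviour: character inputs become factors
g_default <- expand.grid(seed_pattern = plist, bb_seq = bb)
class(g_default$seed_pattern)
# [1] "factor"

# Opting out of the conversion keeps them as character vectors
g_chr <- expand.grid(seed_pattern = plist, bb_seq = bb, stringsAsFactors = FALSE)
class(g_chr$seed_pattern)
# [1] "character"

# nchar() refuses factors; capture the error message instead of stopping
msg <- tryCatch(nchar(g_default$seed_pattern),
                error = function(e) conditionMessage(e))
msg
```

With g_chr in place of the original expand.grid() call, the rowwise() %>% mutate() pipeline from the question should then receive character columns, just as it does with expand_grid().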

Note that, at least with your example data, you don't actually need expand_grid(); data.frame() or tibble() would do. You don't need rowwise() either: you'd get the same output without it.
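
As a rough illustration of that last point, here is a sketch that builds the data with tibble() and calls the function once on whole columns, without rowwise(). The function body is copied from the question; the names res and per_row are made up for this example, and the mapply() comparison is only there as a sanity check.

```r
library(dplyr)

# Function from the question, unchanged in behaviour
replace_bb_with_str <- function(seed_pattern = NULL, bb_seq = NULL) {
  sp <- seed_pattern
  gr <- gregexpr("\\?+", sp)
  csml <- lapply(gr, function(m) cumsum(attr(m, "match.length")))
  regmatches(sp, gr) <- lapply(csml, function(e) substring(bb_seq, c(1, e[1]), e))
  sp
}

plist <- c(
  "??????????DRHRTRHLAK??????????",
  "????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????"
)

# tibble() keeps strings as character and recycles the length-1 bb_seq
res <- tibble(seed_pattern = plist, bb_seq = "ndqeegillkkkkfpssyvv") %>%
  mutate(nseq = replace_bb_with_str(seed_pattern = seed_pattern, bb_seq = bb_seq))

# Sanity check: apply the function one element at a time and compare
per_row <- unname(mapply(replace_bb_with_str, plist, "ndqeegillkkkkfpssyvv"))
identical(res$nseq, per_row)
```

If the comparison holds for your real data as well, dropping rowwise() avoids the per-row overhead. If your inputs differ from the example (for instance, a different bb_seq per row), it is worth re-running the check before relying on the vectorised call.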
