I have data consisting of strings in which text and number segments alternate, e.g. VID22CAS05, TEL21XSE12, and I need to count the segments in each string after parsing, e.g. VID22CAS05 -> VID 22 CAS 05 => length of 4.
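data <- c("VID22CAS05", "TEL21XSE12")
string_lengths <- purrr::map(data, function(x) {
  # Append a space after every run of digits or letters, then trim the trailing space
  x_sep <- trimws(x = gsub("(\\d+|[A-Za-z]+)", "\\1 ", x), which = "both")
  # Split on the spaces and count the resulting pieces
  length(strsplit(x_sep, " ")[[1]])
})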
This works fine, but it is very slow on a huge dataset.
Is there a way to speed this up?
>Solution :
Will this do?
lengths(gregexpr('\\d+|[a-zA-Z]+', data))
# [1] 4 4
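This is faster because gregexpr() is vectorized over the whole character vector, so there is no per-element R function call and no intermediate string is built. One caveat: when a string contains no match at all, gregexpr() returns a single -1 for it, so lengths() reports 1 rather than 0. If such strings can occur in your data, a minimal sketch that counts only real match positions could look like this (the extra test strings and the name counts are made up for illustration):

```r
# Hypothetical input: two regular codes, an empty string, and a string with no letters or digits
data <- c("VID22CAS05", "TEL21XSE12", "", "---")

# One vectorized pass over all strings
m <- gregexpr("\\d+|[a-zA-Z]+", data)

# Match positions are >= 1; the "no match" marker is -1, so count only positive positions
counts <- vapply(m, function(pos) sum(pos > 0), integer(1))
counts
```

If every string is guaranteed to contain at least one letter or digit, the plain lengths(gregexpr(...)) call is enough.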