Need to pad numbers inside a semicolon-separated vector in R

I have the following dataframe, and I need to manipulate column a to get to column a_clean:

df=data.frame(a=c("1234-12;23456-123","12345-1234",NA,"1234-013;1234-014"),a_clean=c("01234-0012;23456-0123","12345-1234",NA,"1234-0013;1234-0014"))

I need to zero-pad the number before each hyphen to five digits and the number after each hyphen to four digits.

I don't want to split a into separate rows and then concatenate them back together. My data frame is very large, and I want the string manipulation to be as fast as possible.


Solution:

gsubfn is like gsub, except that the replacement argument can be a function. The function receives the capture groups (the parts of the match that correspond to the parenthesized portions of the regular expression) as separate arguments, and its return value replaces the entire match. Here the pattern matches each pair of digit strings around a hyphen and passes them as x and y to a function written in formula notation. The function converts both to numeric so that sprintf can left-pad them with zeros to five and four digits.

If you are using dplyr, replace transform with mutate.

library(gsubfn)

transform(df, clean = 
  gsubfn("(\\d+)-(\\d+)", ~ sprintf("%05d-%04d", as.numeric(x), as.numeric(y)), a))

giving

                  a               a_clean                 clean
1 1234-12;23456-123 01234-0012;23456-0123 01234-0012;23456-0123
2        12345-1234            12345-1234            12345-1234
3              <NA>                  <NA>                    NA
4 1234-013;1234-014   1234-0013;1234-0014 01234-0013;01234-0014
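
Since the question stresses speed on a large data frame, here is a minimal base R sketch that avoids calling an R function for every match. It is not part of the answer above and has not been benchmarked against it. The helper name pad_codes is hypothetical. The idea is to prepend more zeros than could ever be needed in one vectorized gsub pass, then strip the surplus leading zeros in a second pass. It assumes every code has the form digits-digits, as in the example data.

```r
# Hypothetical helper: zero-pad "digits-digits" codes inside each string
# to widths 5 and 4 using two vectorized gsub() passes.
pad_codes <- function(x) {
  # Pass 1: prepend four zeros to the left number and three to the right,
  # which is enough padding even when a number is a single digit.
  x <- gsub("(\\d+)-(\\d+)", "0000\\1-000\\2", x, perl = TRUE)
  # Pass 2: drop surplus leading zeros, keeping at least 5 and 4 digits.
  # Numbers already longer than the target width are left untruncated.
  gsub("0*(\\d{5,})-0*(\\d{4,})", "\\1-\\2", x, perl = TRUE)
}

a <- c("1234-12;23456-123", "12345-1234", NA, "1234-013;1234-014")
pad_codes(a)
```

With this approach missing values stay NA, because gsub returns NA for NA input. It can be used in the same way as above, for example transform(df, clean = pad_codes(a)).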