Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Split a string between a word and a number

I have some text like the following:

foo_text <- c(
  "73000 PARIS   74000 LYON",
  "75 000 MARSEILLE 68483 LILLE",
  "60  MARSEILLE 68483 LILLE"
)

I’d like to separate each element in two after the first word. Expected output:

"73000 PARIS" "74000 LYON" "75000 MARSEILLE" "68483 LILLE" "60 MARSEILLE" "68483 LILLE"

Note that the number of spaces between two elements in the original text is not necessarily the same (e.g the number of spaces between PARIS and 74000 is not the same than the number of spaces between MARSEILLE and 68483). Also, sometimes the first number has a space in it (e.g 75 000) and sometimes not (e.g 73000).

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I tried to adapt this answer but without success:

(delimitedString = gsub( "^([a-z]+) (.*) ([a-z]+)$", "\\1,\\2", foo_text))

Any idea how to do that?

>Solution :

We can try using strsplit here as follows:

foo_text <- c(
    "73000 PARIS   74000 LYON",
    "75 000 MARSEILLE 68483 LILLE",
    "60  MARSEILLE 68483 LILLE"
)
output <- unlist(strsplit(foo_text, "(?<=[A-Z])\\s+(?=\\d)", perl=TRUE))
output

[1] "73000 PARIS"      "74000 LYON"       "75 000 MARSEILLE" "68483 LILLE"
[5] "60  MARSEILLE"    "68483 LILLE"

The regex pattern used here says to split when:

(?<=[A-Z])  what precedes is an uppercase letter
\\s+        split (and consume) on one or more whitespace characters
(?=\\d)     what follows is a digit
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading