Java regex splitting, but only removing one whitespace

I have this code:

String[] parts = sentence.split("\\s");

and a sentence like: "this is a whitespace   and I want to split it" (note there are 3 whitespaces after "whitespace")

I want to split it so that only the last whitespace of each run is removed, keeping the rest of the original message intact. The output should be

"[this], [is], [a], [whitespace  ], [and], [I], [want], [to], [split], [it]"
(two whitespaces after the word "whitespace")

Can I do this with a regex and, if not, is there another way?

I removed the + from \\s+ to only remove one whitespace

>Solution :

You can use

String[] parts = sentence.split("\\s(?=\\S)");

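That splits on a whitespace char that is immediately followed by a non-whitespace char, so in a run of several whitespaces only the last one is consumed and the others stay attached to the preceding token.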

Details:

  • \s – a whitespace char
  • (?=\S) – a positive lookahead that requires a non-whitespace char to appear immediately to the right of the current location.
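
As a quick check of the pattern described above, here is a minimal sketch run against the sentence from the question. The class name SplitDemo and the helper splitOnLastWhitespace are made up for this illustration; only String.split and the regex itself come from the answer.

```java
import java.util.Arrays;

class SplitDemo {
    // Hypothetical helper: split on one whitespace char that is
    // directly followed by a non-whitespace char.
    static String[] splitOnLastWhitespace(String sentence) {
        return sentence.split("\\s(?=\\S)");
    }

    public static void main(String[] args) {
        // Three spaces after "whitespace", as in the question.
        String sentence = "this is a whitespace   and I want to split it";
        String[] parts = splitOnLastWhitespace(sentence);
        // prints [this, is, a, whitespace  , and, I, want, to, split, it]
        System.out.println(Arrays.toString(parts));
    }
}
```

Because the lookahead needs a non-whitespace char to the right, trailing whitespace at the end of the sentence is never split off and stays attached to the last token. A sentence that starts with a single whitespace would produce an empty first element, so trimming the input first may be worth considering if that matters.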

To make it fully Unicode-aware in Java, add the (?U) embedded flag (the inline equivalent of the Pattern.UNICODE_CHARACTER_CLASS option): .split("(?U)\\s(?=\\S)").
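
To see what the flag changes, the sketch below compares both patterns on a string whose only separator is U+2003 (EM SPACE), which the default \s does not match. The class and method names are invented for this example, and the test string is an assumption chosen to show the difference.

```java
class UnicodeSplitDemo {
    // Default \s: matches only space, \t, \n, \x0B, \f, \r.
    static String[] splitDefault(String s) {
        return s.split("\\s(?=\\S)");
    }

    // With (?U), \s and \S follow the Unicode White_Space property.
    static String[] splitUnicode(String s) {
        return s.split("(?U)\\s(?=\\S)");
    }

    public static void main(String[] args) {
        String s = "a\u2003b"; // "a", EM SPACE, "b"
        System.out.println(splitDefault(s).length); // prints 1 (no split)
        System.out.println(splitUnicode(s).length); // prints 2
    }
}
```

If the input is known to contain only ordinary spaces, tabs and line breaks, the plain pattern should be enough and the flag can be left out.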
