Matcher.replaceAll() removes backslash even when I escape it. Java


I have a feature in my app that replaces some text in JSON (I have simplified it in the example). The replacement may contain escape sequences like \n, \b, \t, etc., which can break the JSON string when I try to build JSON with Jackson. So I decided to use Apache's solution, StringEscapeUtils.escapeJava(), to escape all escape sequences. But Matcher.replaceAll() removes the backslashes that were added by escapeJava().

Here is the code:

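import java.util.regex.Pattern;
// Assumption: Commons Lang 3; Commons Text ships the same class in org.apache.commons.text
import org.apache.commons.lang3.StringEscapeUtils;

public class Main {
    public static void main(String[] args) {
        String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";

        // escapeJava() turns each control character into a backslash sequence,
        // but replaceAll() then interprets those backslashes itself
        String replacedJson = Pattern.compile("toReplace")
                .matcher(json)
                .replaceAll(StringEscapeUtils.escapeJava("replacement \n \b \t"));

        System.out.println(replacedJson);
    }
}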

Expected Output:

{"test2": "Hello replacement \n \b \t \"test\" world"}

Actual Output:

{"test2": "Hello replacement n b t \"test\" world"}

Why does Matcher.replaceAll() remove the backslashes, while System.out.println(StringEscapeUtils.escapeJava("replacement \n \b \t")); prints the correct output, replacement \n \b \t?

>Solution :

String input = "\n";

StringEscapeUtils.escapeJava(input) allows you to transform the above single newline character into two characters: \ and n.

However, \ is a special character in the replacement string passed to replaceAll(). From https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#replaceAll(java.lang.String):

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
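
To make the quoted rule concrete, here is a minimal sketch of how replaceAll() reads its replacement string. The class name ReplacementSyntaxDemo and the helper replace() are made up for this illustration; only Pattern and Matcher.replaceAll() come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
public class ReplacementSyntaxDemo {

    // Thin wrapper around Matcher.replaceAll(); the replacement is NOT treated literally.
    static String replace(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // $1 is a reference to capture group 1
        System.out.println(replace("key", "(\\w+)", "<$1>"));   // <key>

        // A backslash escapes the next character, so \$ yields a literal $
        System.out.println(replace("price", "price", "\\$5")); // $5

        // The two characters \ and n collapse to a plain n: the backslash is consumed
        System.out.println(replace("X", "X", "a \\n b"));       // a n b

        // Only a doubled backslash survives as a single backslash
        System.out.println(replace("X", "X", "a \\\\n b"));    // a \n b
    }
}
```

The third call is the situation from the question: escapeJava() produces \ followed by n, and replaceAll() then drops the backslash.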

To have them taken as literal characters, you need to escape them via Matcher.quoteReplacement. From https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#quoteReplacement(java.lang.String):

Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Slashes (\) and dollar signs ($) will be given no special meaning.

So in your case:

.replaceAll(Matcher.quoteReplacement(StringEscapeUtils.escapeJava("replacement \n \b \t")))
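
As a rough, dependency-free sketch of the fix: the string literal below stands in for the output of StringEscapeUtils.escapeJava("replacement \n \b \t") so that the example runs on the plain JDK. The class and method names are made up for illustration. Because the search text here is a fixed string rather than a real regex, String.replace(), which treats both arguments literally, is shown as an alternative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
public class QuoteReplacementDemo {

    // Regex-based replacement with the replacement string quoted, so \ and $ stay literal.
    static String replaceWithRegex(String json, String escapedReplacement) {
        return Pattern.compile("toReplace")
                .matcher(json)
                .replaceAll(Matcher.quoteReplacement(escapedReplacement));
    }

    // String.replace(CharSequence, CharSequence) does no regex or replacement-syntax processing.
    static String replaceLiterally(String json, String escapedReplacement) {
        return json.replace("toReplace", escapedReplacement);
    }

    public static void main(String[] args) {
        String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";

        // Assumption: this is what escapeJava("replacement \n \b \t") returns,
        // i.e. each control character written as a backslash plus a letter.
        String escaped = "replacement \\n \\b \\t";

        // Both lines print: {"test2": "Hello replacement \n \b \t \"test\" world"}
        System.out.println(replaceWithRegex(json, escaped));
        System.out.println(replaceLiterally(json, escaped));
    }
}
```

If the search pattern later becomes a real regular expression, only the quoteReplacement variant still applies.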
