I have functionality in my app which should replace some text in JSON (I have simplified it in the example). The replacement may contain escape sequences like \n, \b, \t, etc., which can break the JSON string when I try to build the JSON with Jackson. So I decided to use Apache's solution, StringEscapeUtils.escapeJava(), to escape all escape sequences. But Matcher.replaceAll() removes the backslashes that were added by escapeJava().
Here is the code:
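import java.util.regex.Pattern;
import org.apache.commons.text.StringEscapeUtils; // assumption: Commons Text; older projects use org.apache.commons.lang3

public class Main {
    public static void main(String[] args) {
        String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";
        String replacedJson = Pattern.compile("toReplace")
                .matcher(json)
                .replaceAll(StringEscapeUtils.escapeJava("replacement \n \b \t"));
        System.out.println(replacedJson);
    }
}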
Expected Output:
{"test2": "Hello replacement \n \b \t \"test\" world"}
Actual Output:
{"test2": "Hello replacement n b t \"test\" world"}
Why does Matcher.replaceAll() remove the backslashes, while System.out.println(StringEscapeUtils.escapeJava("replacement \n \b \t")); prints the correct output: replacement \n \b \t?
>Solution :
String input = "\n";
StringEscapeUtils.escapeJava(input) transforms the single newline character above into two characters: \ and n.
However, \ is a special character in pattern replacements. From https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#replaceAll(java.lang.String):
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
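To make the quoted behaviour concrete, here is a minimal sketch using only java.util.regex. The class name ReplacementBackslashDemo and the helper replaceX are made up for illustration; they are not part of the question's code.

```java
import java.util.regex.Pattern;

public class ReplacementBackslashDemo {
    // Hypothetical helper: replaces every "X" in the input, applying the
    // usual regex replacement rules (backslash escapes, $ group references).
    static String replaceX(String input, String replacement) {
        return Pattern.compile("X").matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Runtime replacement is backslash + 'n': the backslash escapes 'n',
        // so only the 'n' is appended.
        System.out.println(replaceX("a X b", "\\n"));   // a n b

        // Runtime replacement is backslash + backslash + 'n': the first
        // backslash escapes the second, so one literal backslash survives.
        System.out.println(replaceX("a X b", "\\\\n")); // a \n b

        // $0 refers to the whole match rather than the characters '$' and '0'.
        System.out.println(replaceX("a X b", "[$0]"));  // a [X] b
    }
}
```

This is the same thing that happens to the output of escapeJava(): each backslash it adds is consumed as an escape character by replaceAll().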
To have them taken as literal characters, you need to escape the replacement via Matcher.quoteReplacement. From https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#quoteReplacement(java.lang.String):
Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Slashes (\) and dollar signs ($) will be given no special meaning.
So in your case:
.replaceAll(Matcher.quoteReplacement(StringEscapeUtils.escapeJava("replacement \n \b \t")))
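For reference, a self-contained sketch of the fix. To avoid the Apache dependency, it hard-codes the string that escapeJava("replacement \n \b \t") is expected to return. The class name QuoteReplacementDemo and the helper replaceLiterally are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {
    // Hypothetical helper: replaces every match of `regex` with `replacement`,
    // treating the replacement as literal text (no \ or $ handling).
    static String replaceLiterally(String input, String regex, String replacement) {
        return Pattern.compile(regex)
                .matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String json = "{\"test2\": \"Hello toReplace \\\"test\\\" world\"}";

        // Stand-in for StringEscapeUtils.escapeJava("replacement \n \b \t"):
        // each control character becomes a backslash followed by a letter.
        String escaped = "replacement \\n \\b \\t";

        // Prints: {"test2": "Hello replacement \n \b \t \"test\" world"}
        System.out.println(replaceLiterally(json, "toReplace", escaped));

        // When the search text is not really a regex, String.replace does a
        // plain literal replacement and gives the same result.
        System.out.println(json.replace("toReplace", escaped));
    }
}
```

If the text being searched for is a fixed string rather than a pattern, as it is in the simplified example, String.replace may be the simpler choice because neither side is interpreted as a regex.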