I am trying to replace each \ with a double \ (\\) in Java. The catch is that sometimes the \ belongs to a Unicode escape sequence (e.g. \u00E9), and in that case I don't want to escape it.
Currently I am stuck at the actual replacement of the \. This is just an example that I will refactor a bit later:
static boolean isUnicodeChar(String string) {
    final String regex = "(?<!\\\\)(\\\\\\\\)*\\\\u[A-Fa-f\\d]{4}";
    final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE | Pattern.UNICODE_CASE);
    final Matcher matcher = pattern.matcher(string);
    return matcher.find();
}

public static void main(String[] args) {
    String string = "Muir-Torr \\ \\u00E9 syndrome \\u1234 skd just some \\uabcd arbitrary text \\ s";
    System.out.println("Old String: " + string);
    for (int startPosition = string.indexOf('\\'); startPosition >= 0; startPosition = string.indexOf('\\', startPosition + 1)) {
        int strLength = string.length();
        // int endPosition = startPosition + 5;
        int endPosition = startPosition + 6;
        System.out.println("\\ found at position: " + startPosition);
        if (endPosition <= strLength - 1) {
            String stringToCheck = string.substring(startPosition, endPosition);
            System.out.println("Checking : " + stringToCheck);
            if (!isUnicodeChar(stringToCheck)) {
                //here I should replace the char that is found at start position with "\\\\"
                System.out.println("New String: " + string);
            } else {
                System.out.println("No replacement needed, is unicode char: " + stringToCheck);
            }
        } else {
            //here I should replace the char that is found at start position with "\\\\"
            System.out.println("New String: " + string);
        }
    }
}
Replaced string should be:
"Muir-Torr \\\\ \\u00E9 syndrome \\u1234 skd just some \\uabcd arbitrary text \\\\ s";
>Solution :
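First, let's just replace every backslash with two. The joy of Java escaping: each backslash has to be escaped once for the regex engine and once more for the Java string literal, so the pattern "\\\\" matches a single backslash and the replacement "\\\\\\\\" produces two.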
String replaced = string.replaceAll("\\\\", "\\\\\\\\");
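As a side note on the escaping layers in that one-liner, here is a minimal sketch comparing it with String.replace, which treats both arguments as literal text instead of a regex. The class and method names are made up for illustration, and this only covers the "double everything" step, not the Unicode exception.

```java
public class LiteralBackslashDoubling {
    // Hypothetical helper: doubles every backslash with String.replace,
    // which does plain text replacement (no regex, no replacement escapes).
    static String doubleBackslashes(String input) {
        return input.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        // Runtime content of this literal: a \ b \u00E9
        String string = "a \\ b \\u00E9";

        // Regex form from the answer: pattern \\ matches one backslash,
        // replacement \\\\ yields two literal backslashes.
        String viaRegex = string.replaceAll("\\\\", "\\\\\\\\");
        String viaLiteral = doubleBackslashes(string);

        System.out.println(viaRegex.equals(viaLiteral)); // prints true
        System.out.println(viaLiteral);                  // prints a \\ b \\u00E9
    }
}
```

The literal form is easier to read, but it cannot express the lookahead needed for the next step, so the regex version is still the one to build on.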
Now we need to tell the regex engine "don't do that if the backslash is followed by u and four hexadecimal digits". We can use a negative lookahead for that: (?!u[0-9a-fA-F]{4})
String replaced = string.replaceAll("\\\\(?!u[0-9a-fA-F]{4})", "\\\\\\\\");
Result:
Muir-Torr \\ \u00E9 syndrome \u1234 skd just some \uabcd arbitrary text \\ s
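This is already the runtime value of the desired string literal from the question. If you instead need the text to contain the escaped form exactly as written in that literal (four backslashes, and \\u00E9), repeat step one on this result.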
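Putting the pieces together, a self-contained sketch of the lookahead approach could look like the following. The class name, the constant, and the escapeBackslashes helper are hypothetical names chosen for this example; the pattern and replacement are the ones from the answer above.

```java
import java.util.regex.Pattern;

public class BackslashEscaper {
    // A backslash that is NOT followed by 'u' and four hexadecimal digits.
    private static final Pattern LONE_BACKSLASH =
            Pattern.compile("\\\\(?!u[0-9a-fA-F]{4})");

    // Hypothetical helper: doubles every backslash except those that start
    // a \\uXXXX-style escape sequence in the text.
    static String escapeBackslashes(String input) {
        // The replacement text is \\\\ for the regex engine, i.e. two literal backslashes.
        return LONE_BACKSLASH.matcher(input).replaceAll("\\\\\\\\");
    }

    public static void main(String[] args) {
        String string = "Muir-Torr \\ \\u00E9 syndrome \\u1234 skd just some \\uabcd arbitrary text \\ s";
        System.out.println("Old String: " + string);
        System.out.println("New String: " + escapeBackslashes(string));
    }
}
```

Compiling the pattern once avoids recompiling it on every call. Note that this sketch only looks at what follows each backslash, so a backslash directly in front of a valid \uXXXX sequence is still doubled while the one belonging to the sequence is left alone.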