Invalid regular expression – Invalid property name in character class

I am running a Fastify server with a TypeScript file that calls a function to make sure people can't send unwanted characters. Here is the function:

const SAFE_STRING_REPLACE_REGEXP = /[^\p{Latin}\p{Zs}\p{M}\p{Nd}\-\'\s]/gu;
function secure(text:string) {
  return text.replace(SAFE_STRING_REPLACE_REGEXP, "").trim();
}

But when I try to launch my server, I get this error message:
"Invalid regular expression – Invalid property name in character class".

It used to work just fine with my previous regex :

const SAFE_STRING_REPLACE_REGEXP = /[^0-9a-zA-ZàáâäãåąčćęèéêëėįìíîïłńòóôöõøùúûüųūÿýżźñçčšžÀÁÂÄÃÅĄĆČĖĘÈÉÊËÌÍÎÏĮŁŃÒÓÔÖÕØÙÚÛÜŲŪŸÝŻŹÑßÇŒÆČŠŽ∂ð\-\s\']/g;
function secure(text:string) {
  return text.replace(SAFE_STRING_REPLACE_REGEXP, "").trim();
}

But I have been told it wasn't optimized enough. I have also been told that split/join performs better than regex/replace, but I don't know whether I can use it in my case.

Solution:

You need to use

const SAFE_STRING_REPLACE_REGEXP = /[^\p{Script=Latin}\p{Zs}\p{M}\p{Nd}'\s-]/gu;
// or
const SAFE_STRING_REPLACE_REGEXP = /[^\p{sc=Latin}\p{Zs}\p{M}\p{Nd}'\s-]/gu;

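In Unicode property escapes, script names must be prefixed with sc= or Script=, so \p{Latin} has to be written as \p{Script=Latin}. Only General_Category values such as Zs, M, and Nd (and binary properties) can be written without a property name. See the ECMAScript reference.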
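To see the prefix rule in action, here is a minimal sketch that tries to compile a few property escapes with the u flag. The helper name `compilesInUnicodeMode` is made up for this illustration; it only relies on the standard behavior that the RegExp constructor throws on an invalid pattern.

```typescript
// Hypothetical helper: returns true when `source` is a valid pattern
// under the u flag, false when the RegExp constructor rejects it.
function compilesInUnicodeMode(source: string): boolean {
  try {
    new RegExp(source, "u");
    return true;
  } catch {
    // An invalid pattern makes the constructor throw a SyntaxError.
    return false;
  }
}

console.log(compilesInUnicodeMode("\\p{Latin}"));        // false: script name without a prefix
console.log(compilesInUnicodeMode("\\p{Script=Latin}")); // true
console.log(compilesInUnicodeMode("\\p{sc=Latin}"));     // true
console.log(compilesInUnicodeMode("\\p{Nd}"));           // true: General_Category value, no prefix needed
```

A check like this can be handy when you are not sure whether a name is a script or a general category.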

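Also, when you use the u flag, you cannot escape characters that have no special meaning in a regex, so do not escape '. It is also cleaner to move the - char to the end of the character class, where it is always read as a literal hyphen and needs no escape.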
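The escaping rule can be checked the same way. This is a small sketch, with `compilesWithFlags` as a hypothetical helper name, comparing how a character class with an escaped apostrophe behaves with and without the u flag.

```typescript
// Hypothetical helper: true when `source` compiles with the given flags.
function compilesWithFlags(source: string, flags: string): boolean {
  try {
    new RegExp(source, flags);
    return true;
  } catch {
    // Invalid patterns make the RegExp constructor throw a SyntaxError.
    return false;
  }
}

console.log(compilesWithFlags("[\\']", ""));  // true: without u, the needless escape is tolerated
console.log(compilesWithFlags("[\\']", "u")); // false: with u, escaping ' is an error
console.log(compilesWithFlags("['-]", "u"));  // true: plain ' and a trailing - need no escapes
```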

You can use split&join, too:

const SAFE_STRING_REPLACE_REGEXP = /[^\p{Script=Latin}\p{Zs}\p{M}\p{Nd}'\s-]/u;
console.log("Ącki-Łał русский!!!中国".split(SAFE_STRING_REPLACE_REGEXP).join(""))

Note that you don't need the g flag with split: it always splits on every match, with or without g.
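If you want to convince yourself that both approaches clean a string the same way, here is a sketch that puts them side by side. The names `UNSAFE_CHARS`, `secureWithReplace`, and `secureWithSplitJoin` are illustrative only; the pattern is the corrected one from above, kept as a string so both variants can share it with different flags. It compares results only and does not measure performance.

```typescript
// Corrected pattern source, shared by both variants below.
const UNSAFE_CHARS = "[^\\p{Script=Latin}\\p{Zs}\\p{M}\\p{Nd}'\\s-]";

// Variant 1: remove every unwanted character with replace (needs the g flag).
function secureWithReplace(text: string): string {
  return text.replace(new RegExp(UNSAFE_CHARS, "gu"), "").trim();
}

// Variant 2: split on every unwanted character and glue the pieces back.
function secureWithSplitJoin(text: string): string {
  return text.split(new RegExp(UNSAFE_CHARS, "u")).join("").trim();
}

const sample = "Ącki-Łał русский!!!中国";
console.log(secureWithReplace(sample) === secureWithSplitJoin(sample)); // true
```

Since both return the same string, the choice between them comes down to readability and to whatever a benchmark on your own inputs shows.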
