PHP: Check for characters in the Latin script plus spaces and numbers

I am new to regex and I have been going round and round on this problem.

The question "PHP: Check alphabetic characters from any latin-based language?" gives the brilliant regex to check for any characters in the Latin script, which is part of what I need.

^\p{Latin}+$

and provides a working example at https://regex101.com/r/I5b2mC/1

If I use the regex in PHP by using

echo preg_match('/^\p{Latin}+$/', $testString);

and $testString contains only Latin letters, the output will be 1. If there are any non-Latin letters, the output will be 0. Brilliant.

To add numbers I tried ^\p{Latin}+[[:alnum:]]*$ but that allows either characters in the Latin script or unaccented letters and numbers (letters without grave, acute, cedilla, umlaut etc.), as [[:alnum:]] is equivalent to [a-zA-Z0-9].

If you mix numbers with characters in the Latin script, echo preg_match('/^\p{Latin}+[[:alnum:]]*$/', $testString); returns 0. A string made up only of numbers returns 0 too. This can be confirmed by editing the expression at https://regex101.com/r/I5b2mC/1

How do I edit the expression in echo preg_match('/^\p{Latin}+$/', $testString); to output a 1 if there are any characters in the Latin script, any numbers and/or spaces in $testString? For example, I wish for a 1 to be output if $testString is Café ßüs 459.

>Solution :

There are three things to change:

  • Add the u flag to support chars other than ASCII (/^\p{Latin}+$/ => /^\p{Latin}+$/u)
  • Create a character class to hold the letter, digit and whitespace patterns (/^\p{Latin}+$/u => /^[\p{Latin}]+$/u)
  • Then add the digit and whitespace patterns to the class. If you need to support any Unicode digits, add \d. If you need to support only ASCII digits, add 0-9.

Thus, you can use

preg_match('/^[\p{Latin}\s0-9]+$/u', $testString) // ASCII only digits
preg_match('/^[\p{Latin}\s\d]+$/u', $testString)  // Any digits

Also, with the u flag, \s matches any Unicode whitespace character, not just ASCII whitespace.
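
To see how the two patterns behave, here is a minimal sketch that wraps them in a helper. The function name isLatinText and its $unicodeDigits parameter are hypothetical and not part of the original answer. It also assumes the script file and the input are UTF-8 encoded.

```php
<?php
// Hypothetical helper (not from the original answer): returns true when
// $text consists only of Latin-script letters, whitespace and digits.
function isLatinText(string $text, bool $unicodeDigits = false): bool
{
    // With the u flag, \d matches any Unicode decimal digit; 0-9 is ASCII only.
    $digits = $unicodeDigits ? '\d' : '0-9';
    $pattern = '/^[\p{Latin}\s' . $digits . ']+$/u';

    // preg_match returns 1 on a match, 0 on no match, and false on error
    // (for example, when $text is not valid UTF-8), so compare against 1.
    return preg_match($pattern, $text) === 1;
}

var_dump(isLatinText('Café ßüs 459'));  // bool(true)
var_dump(isLatinText('Привет 459'));    // bool(false): Cyrillic letters
var_dump(isLatinText('Café!'));         // bool(false): "!" is not in the class
var_dump(isLatinText('٤٥٩'));           // bool(false): Arabic-Indic digits, ASCII-only mode
var_dump(isLatinText('٤٥٩', true));     // bool(true): \d accepts Unicode digits
```

Comparing against 1 rather than echoing the raw result keeps an error (false) from being mistaken for "no match". An empty string is rejected because of the + quantifier; swap it for * if empty input should pass.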
