Regex to extract a number from a text

I’m trying to extract a 9-digit reference from a text. This reference always starts with a 2 or a 5.

Example: Hello. My reference is 233445566.
Output: 233445566

I’ve been using the expression ([2,5][0-9]\w{7,7}) and it works. However, if the sentence is "Hello. My phone number is 6233445566." the output is also ‘233445566’, and I don’t want that. In this scenario, the expression shouldn’t return anything.

Any idea of how I can avoid this problem?

Thanks!

Solution:

You can use a word boundary to make sure that the matched reference is not part of a larger number. A word boundary \b matches the position between a word character (as defined by \w) and a non-word character (as defined by \W), or between a word character and the start or end of the string. It matches a position only and consumes no characters.
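To see the boundary behaviour in isolation, here is a small Python sketch. The helper name find_numbers and the sample strings are made up for illustration; only the re module calls are standard.

```python
import re

# \b asserts a position between a word character and a non-word
# character (or the string edge), so \d+ must be a whole "word".
STANDALONE_NUMBER = re.compile(r"\b\d+\b")

def find_numbers(text):
    """Return digit runs that are not glued to other word characters."""
    return STANDALONE_NUMBER.findall(text)

print(find_numbers("ref 233445566."))  # ['233445566']
print(find_numbers("abc123 456"))      # ['456'] because 123 touches "abc"
```

In the second call, "123" is skipped because there is no boundary between "c" and "1": both are word characters.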

Here is the modified regular expression that includes the word boundary \b at the beginning and end: \b([25][0-9]\w{7})\b

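This regular expression matches a 2 or a 5, then one digit, then seven word characters, with a word boundary on each side. The match is therefore exactly 9 characters long and cannot be part of a longer number. Note that \w also matches letters and underscores; if the reference must consist of digits only, \b[25][0-9]{8}\b is a stricter alternative.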
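A minimal Python sketch of the expression applied to the two sentences from the question. The function name extract_reference is hypothetical, and the digits-only pattern is an assumed stricter variant rather than part of the original expression.

```python
import re

# The expression from the answer: 2 or 5, one digit, seven word characters.
REFERENCE_RE = re.compile(r"\b([25][0-9]\w{7})\b")

# Assumed stricter variant: exactly nine digits, no letters or underscores.
DIGITS_ONLY_RE = re.compile(r"\b([25][0-9]{8})\b")

def extract_reference(text, pattern=REFERENCE_RE):
    """Return the first 9-character reference in text, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None

print(extract_reference("Hello. My reference is 233445566."))
# 233445566
print(extract_reference("Hello. My phone number is 6233445566."))
# None
```

The second call returns None because the only place a boundary occurs before the number is in front of the leading 6, which is neither 2 nor 5.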
