Regex should not match if special character found anywhere in the string

Please help me!

I am parsing strings that contain weights.
But here is the catch: some strings contain a range (see the third example below), which I consider an ambiguous value and do not want to match at all.

Examples:

1.0kg - should return group(1)='1.0', group(2)='kg'
400.00g - should return group(1)='400.00', group(2)='g'
100-800g - right now returns group(1)='800', group(2)='g', but should not return match!

The regex I am using right now is:

r"([\d.,]+)(g|kg)"

How can I modify it so that the third line does not return a match?

Right now I check whether the string contains '-' before applying the regex, but I wonder how to do this with a regex pattern alone, without extra if-else statements.
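
For reference, the behaviour described in the question can be reproduced with a short sketch. The pattern is the one quoted above; the helper name `parse_with_precheck` is hypothetical and only illustrates the "check for a dash first" workaround.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL_RE = re.compile(r"([\d.,]+)(g|kg)")


def parse_with_precheck(text):
    """Hypothetical helper: reject strings containing '-' before matching."""
    if "-" in text:
        return None
    match = ORIGINAL_RE.search(text)
    return match.groups() if match else None


# Without the pre-check, the range still yields a match on its second number.
print(ORIGINAL_RE.search("100-800g").groups())  # ('800', 'g')

# With the pre-check, the range is rejected and plain weights still parse.
for sample in ["1.0kg", "400.00g", "100-800g"]:
    print(sample, "->", parse_with_precheck(sample))
```

The extra `if` is what the question wants to fold into the pattern itself.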

>Solution :

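You may use the following regex pattern:

(?<!-)\b(\d+(?:\.\d+)?)(kg|g)

This pattern excludes numbers that are immediately preceded by a dash, while still requiring a word boundary on the left of the number. The word boundary matters: without it, the regex could start matching at the second digit of 800 in 100-800g, where the preceding character is a digit rather than a dash. Note that only a dash directly before the number is rejected; a range written with spaces, such as 100 - 800g, would still match 800g.

Explanation:

  • (?<!-) asserts that the number is not preceded by a hyphen (this eliminates 100-800g)
  • \b still requires a word boundary before the number
  • (\d+(?:\.\d+)?) captures an integer with an optional decimal component as group 1
  • (kg|g) captures the unit, kilograms or grams, as group 2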
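
The lookbehind approach can be tried out with a minimal Python sketch. It assumes a variant of the pattern with two capture groups and an explicit `kg|g` alternation, so that group(1) and group(2) come out as the question asks. The helper names `parse_weight` and `parse_weight_strict` are hypothetical. The second pattern is an alternative not taken from the answer: an anchored negative lookahead that rejects the string if a dash appears anywhere in it, which is closer to what the title asks for.

```python
import re

# Assumed variant of the answer's pattern: group 1 = number, group 2 = unit.
WEIGHT_RE = re.compile(r"(?<!-)\b(\d+(?:\.\d+)?)(kg|g)")

# Alternative (not from the answer): fail if '-' occurs anywhere in the string.
STRICT_RE = re.compile(r"^(?!.*-).*?\b(\d+(?:\.\d+)?)(kg|g)")


def parse_weight(text):
    """Hypothetical helper: return (number, unit), or None if nothing matches."""
    match = WEIGHT_RE.search(text)
    return match.groups() if match else None


def parse_weight_strict(text):
    """Hypothetical helper: like parse_weight, but any dash rejects the string."""
    match = STRICT_RE.search(text)
    return match.groups() if match else None


for sample in ["1.0kg", "400.00g", "100-800g", "100 - 800g"]:
    print(sample, "->", parse_weight(sample), "| strict:", parse_weight_strict(sample))
```

With these inputs, both helpers return the number and unit for the first two strings and nothing for 100-800g. They differ only on 100 - 800g, where the lookbehind version still picks up 800g and the strict version rejects the whole string.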

