Why does this regex not match my groups in Python?

I have the following complete code example:

import re

examples = [
    "D1",       # expected: ('1')
    "D1sjdgf",  # ('1')
    "D1.2",     # ('1', '2')
    "D1.2.3",   # ('1', '2', '3')
    "D3.10.3x", # ('3', '10', '3')
    "D3.10.11"  # ('3', '10', '11')
]

for s in examples:
    result = re.search(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))', s)
    print(s, result.groups())

I want to match the one, two, or three numbers in a string that always starts with the letter "D". I am not interested in anything after the last digit.

I would expect my regex to match e.g. D3.10.3x and return ('3','10','3'), but instead it returns only ('3',). I do not understand why.

^D(\d+)(?:\.(\d+)(?:\.(\d+)))

  • ^D matches "D" at the start
  • (\d+) matches the first number (one or more digits) inside a group.
  • (?: starts a non-matching group. I do not want to get this group back.
  • \. A literal point
  • (\d+) A group of one or more numbers I want to "catch"

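I also do not understand what a "non-capturing" group means in this context, as mentioned in this answer.

Solution:

Your regex requires all three numbers because neither non-capture group is marked optional. Keep the start anchor and make both nested non-capture groups optional with ?, so that capture groups #2 and #3 sit inside optional non-capture groups: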

^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?

Explanation:

  • ^: Start
  • D: Match letter D
  • (\d+): Match 1+ digits in capture group #1
  • (?:: Start outer non-capture group
    • \.: Match a dot
    • (\d+): Match 1+ digits in capture group #2
    • (?:: Start inner non-capture group
      • \.: Match a dot
      • (\d+): Match 1+ digits in capture group #3
    • )?: End inner optional non-capture group
  • )?: End outer optional non-capture group
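
To make the two ideas above concrete, here is a minimal sketch (the variable names are only illustrative). It first shows that a non-capture group (?:...) still takes part in the match but does not appear in groups(), and then that without the trailing ? both nested groups are mandatory, so shorter inputs do not match at all:

```python
import re

# A non-capture group groups the pattern but is not returned by groups().
non_capturing = re.search(r'(?:\.(\d+))', '.42')
capturing = re.search(r'(\.(\d+))', '.42')
print(non_capturing.group(0))   # the whole match still includes the dot
print(non_capturing.groups())   # only the inner (\d+) is reported
print(capturing.groups())       # the outer group is reported as well

# Without "?" both nested groups are required.
strict = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))')
print(strict.search("D1"))      # no match, so search() returns None
print(strict.search("D1.2"))    # no match either
print(strict.search("D1.2.3").groups())
```

Because search() returns None when nothing matches, calling .groups() on that result raises an AttributeError, which is worth guarding against when the input may not match.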

Code Demo:

import re

examples = [
    "D1",       # expected: ('1')
    "D1sjdgf",  # ('1')
    "D1.2",     # ('1', '2')
    "D1.2.3",   # ('1', '2', '3')
    "D3.10.3x", # ('3', '10', '3')
    "D3.10.11"  # ('3', '10', '11')
]

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

for s in examples:
    result = rx.search(s)
    print(s, result.groups())

Output:

D1 ('1', None, None)
D1sjdgf ('1', None, None)
D1.2 ('1', '2', None)
D1.2.3 ('1', '2', '3')
D3.10.3x ('3', '10', '3')
D3.10.11 ('3', '10', '11')
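
Groups that did not take part in the match are reported as None. If you prefer tuples that contain only the numbers actually found, as in the comments of the example list, one possible approach is to filter the None values out. The sketch below assumes a hypothetical helper named extract_numbers (not part of the original code) and returns an empty tuple when the string does not match:

```python
import re

RX = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

def extract_numbers(s):
    """Return the matched numbers as a tuple of strings, or () if s does not match."""
    m = RX.match(s)  # match() anchors at the start, like the leading ^
    if m is None:
        return ()
    # Drop groups that did not participate in the match.
    return tuple(g for g in m.groups() if g is not None)

for s in ["D1", "D1.2", "D3.10.3x", "X1"]:
    print(s, extract_numbers(s))
```

With this helper, "D1" gives ('1',), "D1.2" gives ('1', '2'), "D3.10.3x" gives ('3', '10', '3'), and the non-matching "X1" gives ().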