Regular expression for extracting Russian passport numbers

I need a regular expression that extracts the passport number following the word паспорт.

Possible options are:

  • паспорт 5715 424141
  • паспорт 5715-424141
  • паспорт 5715 - 424141

I need to extract the first 4 and the last 6 digits that follow the word паспорт, so the result should be 5715 and 424141.

I tried ^(\d{4})\ (\d{6})$ but it doesn't match my input.

Solution:

For starters, the ^ symbol means the start of the string, so that already fails your pattern (as the strings start with "паспорт").

It also seems that the - between the number groups is optional and may be surrounded by spaces, neither of which your pattern allows.
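A quick check, as a minimal sketch, makes both problems visible: the anchored pattern matches none of the sample lines, and dropping the anchors only rescues the space-separated form. The variable names are illustrative and not part of the question.

```python
import re

attempt = r"^(\d{4})\ (\d{6})$"  # the pattern from the question
samples = [
    "паспорт 5715 424141",
    "паспорт 5715-424141",
    "паспорт 5715 - 424141",
]

# ^ anchors at the start of the string, where "паспорт" sits, so nothing matches.
anchored = [re.search(attempt, s) is not None for s in samples]

# Without the anchors, a single literal space still cannot cover the dashed forms.
unanchored = [re.search(r"(\d{4})\ (\d{6})", s) is not None for s in samples]

print(anchored)    # [False, False, False]
print(unanchored)  # [True, False, False]
```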

To fix all those issues, use:

паспорт (\d{4})\s*-?\s*(\d{6})
  • паспорт – literal match.
  • (\d{4}) – a capture group of four digits.
  • \s* – any amount of whitespace, including none.
  • -? – an optional dash.
  • \s* – any amount of whitespace, including none.
  • (\d{6}) – a capture group of six digits.

And since you tagged with Python:

import re

s = """паспорт 5715 424141
паспорт 5715-424141
паспорт 5715 - 424141"""

for line in s.splitlines():
    print(re.search(r"паспорт (\d{4})\s*-?\s*(\d{6})", line).groups())
# prints ('5715', '424141') for each of the three lines
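If the word appears inside longer text, or some inputs contain no passport number at all, calling .groups() directly on the result of re.search raises AttributeError when there is no match. The following is one possible sketch of a safer wrapper. The helper name extract_passports, the use of re.IGNORECASE, and the \s+ after the word are assumptions added for illustration and are not part of the original answer.

```python
import re

# Hypothetical helper: the name, IGNORECASE, and \s+ after the word are assumptions.
PASSPORT_RE = re.compile(r"паспорт\s+(\d{4})\s*-?\s*(\d{6})", re.IGNORECASE)


def extract_passports(text):
    """Return a list of (series, number) tuples for every match in text."""
    return [m.groups() for m in PASSPORT_RE.finditer(text)]


text = "Паспорт 5715 - 424141 выдан; no number here; паспорт 5715-424141"
print(extract_passports(text))
# [('5715', '424141'), ('5715', '424141')]

print(extract_passports("no passport in this line"))
# []
```

Using finditer returns an empty list instead of failing when nothing matches, which is usually easier to handle than a None check at every call site.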
