Regular expression for Python 3 to catch numbers within a text file

I have a text file with entries like:

2: Adcock R, Cuzick J, Hunt WC, McDonald RM, Wheeler CM; New Mexico HPV Pap
Registry Steering Committee. Role of HPV Genotype, Multiple Infections, and
Viral Load on the Risk of High-Grade Cervical Neoplasia. Cancer Epidemiol
Biomarkers Prev. 2019 Nov;28(11):1816-1824. doi: 10.1158/1055-9965.EPI-19-0239.
Epub 2019 Sep 5. PMID: 31488417; PMCID: PMC8394698.

3: Castle PE, Adcock R, Cuzick J, Wentzensen N, Torrez-Martinez NE, Torres SM,
Stoler MH, Ronnett BM, Joste NE, Darragh TM, Gravitt PE, Schiffman M, Hunt WC,
Kinney WK, Wheeler CM; New Mexico HPV Pap Registry Steering Committee; p16 IHC
Study Panel. Relationships of p16 Immunohistochemistry and Other Biomarkers With
Diagnoses of Cervical Abnormalities: Implications for LAST Terminology. Arch
Pathol Lab Med. 2020 Jun;144(6):725-734. doi: 10.5858/arpa.2019-0241-OA. Epub
2019 Nov 13. PMID: 31718233; PMCID: PMC8575174.

I want to write a Python program that extracts all the numbers that follow "PMID:".

I tried:

import re
path = 'summaryCosetteWheset.txt'
pmidsFile = open(path, 'r')
info = pmidsFile.read()
print(info)

pmidsList = re.findall(r'PMID: (\d)+;', info)

print(pmidsList)

But I am only getting single digits, not full numbers like 31718233. Is there a way to do this? Thanks.

PS: I just started with Python 3.

Solution:

You need to move the + inside the capturing group to capture all the digits in each match.

pmidsList = re.findall(r'PMID: (\d+);', info)

With the + outside the capturing group, the group matches a single digit and is repeated once per digit, so only the last digit of each match is retained.
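To see the difference side by side, here is a minimal sketch that runs both patterns on a short sample string instead of the file. The sample text is copied from the two entries in the question; the variable names `last_digits` and `pmids` are only illustrative.

```python
import re

# Sample text copied from the two entries in the question (stands in for the file contents).
info = (
    "Epub 2019 Sep 5. PMID: 31488417; PMCID: PMC8394698.\n"
    "2019 Nov 13. PMID: 31718233; PMCID: PMC8575174.\n"
)

# Quantifier outside the group: the one-digit group is repeated,
# and findall returns only what the group captured on its last repetition.
last_digits = re.findall(r'PMID: (\d)+;', info)

# Quantifier inside the group: the group spans the whole run of digits.
pmids = re.findall(r'PMID: (\d+);', info)

print(last_digits)  # ['7', '3']
print(pmids)        # ['31488417', '31718233']
```

Note that findall returns the IDs as strings; convert them with int() if you need numbers.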
