Regular expression for Python 3 to catch numbers within a text file

December 2, 2021

I have a text file with entries like:

2: Adcock R, Cuzick J, Hunt WC, McDonald RM, Wheeler CM; New Mexico HPV Pap
Registry Steering Committee. Role of HPV Genotype, Multiple Infections, and
Viral Load on the Risk of High-Grade Cervical Neoplasia. Cancer Epidemiol
Biomarkers Prev. 2019 Nov;28(11):1816-1824. doi: 10.1158/1055-9965.EPI-19-0239.
Epub 2019 Sep 5. PMID: 31488417; PMCID: PMC8394698.

3: Castle PE, Adcock R, Cuzick J, Wentzensen N, Torrez-Martinez NE, Torres SM,
Stoler MH, Ronnett BM, Joste NE, Darragh TM, Gravitt PE, Schiffman M, Hunt WC,
Kinney WK, Wheeler CM; New Mexico HPV Pap Registry Steering Committee; p16 IHC
Study Panel. Relationships of p16 Immunohistochemistry and Other Biomarkers With
Diagnoses of Cervical Abnormalities: Implications for LAST Terminology. Arch
Pathol Lab Med. 2020 Jun;144(6):725-734. doi: 10.5858/arpa.2019-0241-OA. Epub
2019 Nov 13. PMID: 31718233; PMCID: PMC8575174.

I want to write a Python program that extracts all the numbers that follow "PMID:".

I tried:

import re
path = 'summaryCosetteWheset.txt'
with open(path, 'r') as pmidsFile:  # the file is closed automatically
    info = pmidsFile.read()
print(info)

pmidsList = re.findall(r'PMID: (\d)+;', info)

print(pmidsList)

But I only get single digits, not full numbers like 31718233. Is there a way to do this? Thanks.

PS: I just started with Python 3.

Solution:

You need to move the + inside the capturing group so that the group captures all the digits in each match:

pmidsList = re.findall(r'PMID: (\d+);', info)

With the + outside the capturing group, the group is repeated once per digit and only its last repetition is retained, so re.findall returns just the final digit of each PMID.
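To make the difference concrete, here is a minimal sketch that runs both patterns on a short sample string. The sample is abbreviated from the entries in the question. The third pattern is a looser variant that is not part of the original answer; it assumes you may also want PMIDs that are not followed by a semicolon.

```python
import re

# Sample text abbreviated from the entries shown in the question.
sample = (
    "Epub 2019 Sep 5. PMID: 31488417; PMCID: PMC8394698.\n"
    "2019 Nov 13. PMID: 31718233; PMCID: PMC8575174.\n"
)

# Quantifier outside the group: the group is overwritten on every
# repetition, so only the last digit of each PMID survives.
last_digits = re.findall(r'PMID: (\d)+;', sample)

# Quantifier inside the group: the group spans the whole run of digits.
pmids = re.findall(r'PMID: (\d+);', sample)

# Looser variant (an assumption, not from the answer): no trailing ';'
# required and any amount of whitespace allowed after the colon.
pmids_loose = re.findall(r'PMID:\s*(\d+)', sample)

print(last_digits)  # ['7', '3']
print(pmids)        # ['31488417', '31718233']
print(pmids_loose)  # ['31488417', '31718233']
```

Note that "PMCID:" does not contain the substring "PMID:", so none of these patterns pick up the PMC identifiers.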

regex

by MR

Published December 02, 2021
