Python regex – Extract all the matching text between two patterns

September 28, 2022

I want to extract the text that follows each bullet point numbered 1.1, 1.2, 1.3, and so on. The numbering sometimes contains stray spaces, for example 1. 1, 1. 2, 1 .3 or 1 . 4.

Sample text

    text = "some text before pattern 1.1 text_1_here  1.2 text_2_here  1 . 3 text_3_here  1. 4 text_4_here  1 .5 text_5_here 1.10 last_text_here 1.23 text after pattern"

For the text above, the output should be:

    [' text_1_here ', ' text_2_here ', ' text_3_here ', ' text_4_here ', ' text_5_here ', ' last_text_here ']

I tried re.findall, but it does not return the required output. It extracts the text between 1.1 and 1.2, and then the text between 1.3 and 1.4, but it skips the text between 1.2 and 1.3.

    import re
    re.findall(r'[0-9].\s?[0-9]+(.*?)[0-9].\s?[0-9]+', text)
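The skipping can be reproduced on a smaller input. The sketch below uses a simplified, hypothetical sample string and a simplified pattern (neither comes from the original text) to show that `re.findall` returns non-overlapping matches: the closing bullet is consumed by one match, so it cannot also open the next one.

```python
import re

# Hypothetical reduced sample: three bullets with one piece of text between each pair.
sample = "1.1 a 1.2 b 1.3"

# Simplified version of the attempt above (dot escaped, no optional spaces).
# Each match consumes both its opening and its closing bullet.
matches = re.findall(r'\d\.\d+(.*?)\d\.\d+', sample)

# Only the text between 1.1 and 1.2 is found; "1.2" was already consumed,
# so the scan resumes after it and nothing is left to close a second match.
print(matches)
```

On this sample only the first piece of text is returned, which mirrors the "every other item" behaviour described above.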

>Solution :

I'm not sure of the exact rule for excluding the last bit of text, but based on your comments we could simply split the entire text on the bullets and drop the first and last elements of the resulting list:

    re.split(r'\s+\d(?:\s*\.\s*\d+)+\s+', text)[1:-1]

Which would output:

    ['text_1_here', 'text_2_here', 'text_3_here', 'text_4_here', 'text_5_here', 'last_text_here']
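If you would rather keep `re.findall`, one possible alternative is to put the closing bullet inside a lookahead, so it is checked but not consumed and can still open the next match. This is only a sketch: the `BULLET` name and the exact pattern are assumptions for illustration, and it presumes every bullet is a single digit, a dot, and one or more digits, with optional spaces around the dot.

```python
import re

# Sample text from the question.
text = "some text before pattern 1.1 text_1_here  1.2 text_2_here  1 . 3 text_3_here  1. 4 text_4_here  1 .5 text_5_here 1.10 last_text_here 1.23 text after pattern"

# Hypothetical helper pattern: digit, optional spaces, dot, optional spaces, digits.
BULLET = r'\d\s*\.\s*\d+'

# (?=...) only asserts that another bullet follows; it does not consume it,
# so that bullet is still available as the start of the next match.
items = re.findall(rf'{BULLET}\s+(.*?)\s+(?={BULLET})', text)
print(items)
```

Like the split approach, this strips the surrounding whitespace and leaves out the text after the final bullet, because no further bullet follows it.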


by MR

Published September 28, 2022
