Not more than one special symbol in a range from a long text

byMR

January 31, 2024

The simplified problem:

There is an article (long text)

Extract the content between start (included) and end (included)

Requirement: There cannot be more than one \n between start and end

Find all matches

Use only Python's re module

The code:

import re

# pattern and text are one of the pattern/text pairs listed under the attempts
lines = re.findall(pattern, text, re.DOTALL)
for line in lines:
    print(line)
    print('===')

So, how can I fix my pattern?

Patterns I have tried:

start[^\n]*\n?[^\n]*end
with text:

...
start just me and python regex 1 end
start just me and python regex 2 end
start just me and python regex 3 end
...

wrong:

start just me and python regex 1 end
start just me and python regex 2 end --> should be split from the line before
===
start just me and python regex 3 end
===
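
The merged first match comes from the greedy quantifiers: `[^\n]*` runs to the end of the line, `\n?` then consumes the newline, and the second `[^\n]*` only backtracks as far as the last `end` on the following line. A minimal sketch reproducing this, assuming the sample text without the `...` lines (the variable names are illustrative):

```python
import re

text = (
    "start just me and python regex 1 end\n"
    "start just me and python regex 2 end\n"
    "start just me and python regex 3 end\n"
)

# Greedy version from the first attempt.
greedy_matches = re.findall(r"start[^\n]*\n?[^\n]*end", text)

for match in greedy_matches:
    print(match)
    print("===")
```

The first match spans two lines, so only two matches come back instead of three.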

start(?:(?!\n\n).)*?end and start(?:[^\n]|\n(?!\n))*?end
with text:

start just 
me and python 
regex 1 end
start just me and python regex 2 end
start just me and python regex 3 end

wrong:

start just 
me and python 
regex 1 end --> should not match, because there are two `\n` in it
===
start just me and python regex 2 end
===
start just me and python regex 3 end
===
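
Both of these patterns only forbid two consecutive newlines (`\n\n`). They do not limit the total number of newlines, so a range wrapped over three lines still matches. A small sketch that counts the newlines inside each match (the variable names are illustrative):

```python
import re

text = (
    "start just \n"
    "me and python \n"
    "regex 1 end\n"
    "start just me and python regex 2 end\n"
    "start just me and python regex 3 end\n"
)

patterns = [
    r"start(?:(?!\n\n).)*?end",
    r"start(?:[^\n]|\n(?!\n))*?end",
]

# For each pattern, the number of newlines inside every match it returns.
newline_counts = [
    [m.count("\n") for m in re.findall(p, text, re.DOTALL)]
    for p in patterns
]
print(newline_counts)
```

With either pattern, the first match contains two newlines, which is exactly the case the requirement is meant to exclude.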

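Solution:

You can use: start[^\n]*?\n?[^\n]*?end

The lazy quantifiers (*?) make each match stop at the first end it can reach, and the single optional \n allows at most one newline inside a match.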
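
A minimal sketch checking this pattern against both sample texts from the question. `find_ranges` and the variable names are hypothetical helpers, not part of `re`. The `re.DOTALL` flag is left out because the pattern contains no `.`, so the flag would have no effect.

```python
import re

# At most one newline can occur inside a match: the only place a "\n"
# may be consumed is the single optional "\n?" in the middle.
PATTERN = re.compile(r"start[^\n]*?\n?[^\n]*?end")


def find_ranges(text):
    """Return every start...end span that contains at most one newline."""
    return PATTERN.findall(text)


single_lines = (
    "start just me and python regex 1 end\n"
    "start just me and python regex 2 end\n"
    "start just me and python regex 3 end\n"
)
wrapped = (
    "start just \n"
    "me and python \n"
    "regex 1 end\n"
    "start just me and python regex 2 end\n"
    "start just me and python regex 3 end\n"
)

for sample in (single_lines, wrapped):
    for match in find_ranges(sample):
        print(match)
        print("===")
    print("--- next sample ---")
```

For the first text each line is returned separately. For the second, the range wrapped over three lines is skipped and only the two single-line ranges are returned. Note that the pattern matches the literal substring `end`, so a word such as `friend` would also close a range; adding `\b` word boundaries is one possible refinement if that matters.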

python-re
