Which regular expression extracts the text between two lines that each contain a string followed by an arbitrary number of digits?

That’s the code I have:

import re

text = 'LIBRO 1\ndsfsdf\nasdas\nfgfghf\nLIBRO 21\nhghj\nghjhjk\nghjhk\nLIBRO 333'

result = re.findall(r'(?<=LIBRO \d+\n)(.*?)(?=\nLIBRO)', text, re.DOTALL)
print(result)

And this is the error I get:

re.error: look-behind requires fixed-width pattern

The desired result is:

['dsfsdf\nasdas\nfgfghf', 'hghj\nghjhjk\nghjhk']

Solution:

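You could use re.split instead of re.findall and drop the empty entries from the result, since the split also yields an empty string for whatever comes before the first LIBRO line and after the last one. The (?m) flag makes ^ and $ match at the start and end of each line, so only whole lines of the form "LIBRO <digits>" act as separators: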

result = [s.strip() for s in re.split(r'(?m)^LIBRO \d+$', text) if s]
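
If you would rather keep re.findall, here is a minimal sketch of one possible alternative. Python's re module only accepts fixed-width look-behinds, and `\d+` can match any number of digits, which is what triggers the error in the question. Assuming every header line looks exactly like `LIBRO <digits>`, the header can be matched as part of the pattern and only the text after it captured, while the look-ahead for the next header stays in place. The name `pattern` is just an illustrative choice.

```python
import re

text = 'LIBRO 1\ndsfsdf\nasdas\nfgfghf\nLIBRO 21\nhghj\nghjhjk\nghjhk\nLIBRO 333'

# Consume the header line instead of looking behind it. The capture group
# holds the body; the look-ahead leaves the next header unconsumed so it
# can start the following match.
pattern = re.compile(r'^LIBRO \d+\n(.*?)(?=\nLIBRO \d+$)', re.DOTALL | re.MULTILINE)

# With a single capture group, findall returns only the captured bodies.
result = pattern.findall(text)
print(result)  # ['dsfsdf\nasdas\nfgfghf', 'hghj\nghjhjk\nghjhk']
```

Note that this variant only returns text that is followed by another header line, so anything after the last LIBRO line would be skipped, whereas the re.split approach would keep it.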
