If I have a string like
b*&^6bolyb{[--9_(marker1JN9&[7&9bkey=-+)*.,mljmarker2,pi*[80[)(Mmp0oiymarker1ojm)*[marker2,;i0m980m.9u090marker1*(7hp0Key0()mu90marker2
how do I extract the part between marker1 and marker2 only if it contains key (or 'Key', or any other case variation)?
So I'd like the code to return:
['JN9&[7&9bkey=-+)*.,mlj', '*(7hp0Key0()mu90']
>Solution :
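We can use re.findall here, with a capture group around the text between the markers so that only that part is returned, and a scoped case-insensitive group (?i:key) so that "Key", "KEY", etc. also match:
import re
inp = "b*&^6bolyb{[--9_(marker1JN9&[7&9bkey=-+)*.,mljmarker2,pi*[80[)(Mmp0oiymarker1ojm)*[marker2,;i0m980m.9u090marker1*(7hp0Key0()mu90marker2"
# findall returns the contents of the single capture group for every match
matches = re.findall(r'marker1((?:(?!marker[12]).)*(?i:key)(?:(?!marker[12]).)*)marker2', inp)
print(matches) # ['JN9&[7&9bkey=-+)*.,mlj', '*(7hp0Key0()mu90']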
The regex pattern used above ensures that we match a marker1 ... key ... marker2 sequence without crossing another marker1 or marker2 boundary. Piece by piece:
marker1                 match "marker1"
(?:(?!marker[12]).)*    match any content WITHOUT crossing a "marker1" or "marker2" marker
key                     match "key"
(?:(?!marker[12]).)*    again match without crossing a marker
marker2                 match "marker2"
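If the markers or the key are not fixed strings, the same idea can be wrapped in a small helper. This is only a sketch: the name extract_between and its parameters are made up for illustration, re.escape is used on the assumption that the markers should be treated as literal text, and re.DOTALL is an added assumption so that a span may contain newlines.

```python
import re

def extract_between(text, start, end, key):
    """Return every span between `start` and `end` that contains `key` in any case.

    Hypothetical helper, not part of the original answer.
    """
    # Escape the inputs so characters like * or [ are matched literally.
    s, e, k = re.escape(start), re.escape(end), re.escape(key)
    # Consume one character at a time, but never step onto either marker.
    safe = rf'(?:(?!{s}|{e}).)*'
    # (?i:...) makes only the key case-insensitive; the markers stay case-sensitive.
    pattern = rf'{s}({safe}(?i:{k}){safe}){e}'
    return re.findall(pattern, text, flags=re.DOTALL)

inp = "b*&^6bolyb{[--9_(marker1JN9&[7&9bkey=-+)*.,mljmarker2,pi*[80[)(Mmp0oiymarker1ojm)*[marker2,;i0m980m.9u090marker1*(7hp0Key0()mu90marker2"
print(extract_between(inp, "marker1", "marker2", "key"))
# ['JN9&[7&9bkey=-+)*.,mlj', '*(7hp0Key0()mu90']
```

The middle marker1ojm)*[marker2 segment is skipped because it contains no key, and the lookahead prevents the match from running on into the next segment to find one.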