I need to match only runs that combine both 'a' and 'b', not runs made of 'a' alone or 'b' alone.
For example, from 'aaaaabbbbsaaaaaaaa'
I need to get only the 'aaaaabbbb' part.
I tried the pattern
'[ab]+'
but it also matches the trailing 'aaaaaaaa', which contains only 'a'.
>Solution :
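Try this. The trick is that the match must contain at least one "a" and at least one "b". That's easy to enforce by splitting it into two cases: the match starts either with one or more "a" followed by a "b" (a+b) or with one or more "b" followed by an "a" (b+a), and after that any mix of "a" and "b" may follow ([ab]*). The group is non-capturing ("(?:…)") because re.findall() returns only the group's contents when the pattern has a capturing group; with a non-capturing group it returns the whole match: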
>>> import re
>>> re.findall("(?:a+b|b+a)[ab]*", 'aaaaabbbbsaaaaaaaa')
['aaaaabbbb']
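If it helps, here is a small sketch that wraps the same pattern in a function and tries it on a few more inputs. The names MIXED_RUN and mixed_runs are made up for this illustration, and the extra test strings are my own, not from the question. The last line shows what happens if the group is left capturing.

```python
import re

# Same pattern as above: a run of a's followed by a b, or a run of b's
# followed by an a, then any further a's and b's.
MIXED_RUN = re.compile(r"(?:a+b|b+a)[ab]*")

def mixed_runs(text):
    """Return every run of a/b characters that contains both letters."""
    return MIXED_RUN.findall(text)

print(mixed_runs("aaaaabbbbsaaaaaaaa"))  # ['aaaaabbbb']
print(mixed_runs("bbbasaaa"))            # ['bbba']
print(mixed_runs("aaa bbb"))             # []

# With a capturing group, findall returns only the group, not the whole match.
print(re.findall(r"(a+b|b+a)[ab]*", "aaaaabbbbsaaaaaaaa"))  # ['aaaaab']
```

Compiling the pattern once is only a convenience if you call it repeatedly; a plain re.findall() call works just as well.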