For some reason, when I re.compile a list of alternated regex patterns, some of the patterns work and some do not. I can’t figure out what the issue is. Any guidance is appreciated.
import re

creditcard_pattern = re.compile(r'''(
    (CREDIT\s?CA?RD)|
    ((CARD|\bCC\b).*PA?YME?N?T?)|
    (APPLECARD)|
    WELLS FARGO.*(CARD|CC)|
    (CITI.*(C?R?E?D?I?T CA?R?D))|
    (CAPITAL ONE)|
    AMERICAN EXPRESS|
    (DISCOVER.*(?!.*1BANK))|
    AMER?I?C?A?N?\s?E?XP?R?E?S?S?|
    CHASE.*CARD|
    (BA?N?K.*AME?RI?C?A?.*PMT)|
    AMEX|
    CITICORP CHOICE|
    CITI (CARD|AUTO|PAYMENT)|
    VISA PLATINUM|
    BARCLAY.*CARD|
    USAA FSB.*ONLINE PMT|
    CITIBANK.*ONLINE PMT
    )
    ''', flags=re.I | re.X
)

if creditcard_pattern.search('CARD PYMT'):
    print('found')
#>> found

if creditcard_pattern.search('BARCLAY CARD'):
    print('found')
#>> found

if creditcard_pattern.search('WELLS FARGO CARD'):
    print('found')
#>> not found

if creditcard_pattern.search('CAPITAL ONE'):
    print('found')
#>> not found
When I test the patterns at https://regexr.com/, they seem to work as expected…
The documentation for re.X states:
Whitespace within the pattern is ignored, except when in a character
class, or when preceded by an unescaped backslash, or within tokens
like *?, (?: or (?P<…>.
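To see that rule in isolation, here is a minimal sketch that compiles the same two-word literal with and without re.X. The variable names are illustrative only and are not taken from the question.

```python
import re

# With re.X the literal space between the words is dropped,
# so this pattern is effectively 'CAPITALONE'.
verbose = re.compile(r'CAPITAL ONE', flags=re.I | re.X)

# Without re.X the space is an ordinary character that must match.
plain = re.compile(r'CAPITAL ONE', flags=re.I)

verbose_spaced = verbose.search('CAPITAL ONE') is not None
verbose_joined = verbose.search('CAPITALONE') is not None
plain_spaced = plain.search('CAPITAL ONE') is not None

print(verbose_spaced, verbose_joined, plain_spaced)
# False True True
```

This is why the alternatives without literal spaces (such as BARCLAY.*CARD) match, while the ones containing a bare space do not.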
So you could escape the single space in ‘CAPITAL ONE’; the corresponding line in your regex becomes:
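A minimal sketch of that change, with the pattern trimmed to two alternatives for brevity. The name fixed_pattern and the [ ] variant used for WELLS FARGO are illustrative additions, not part of the original pattern; the other alternatives that contain a bare space (AMERICAN EXPRESS, CITICORP CHOICE, and so on) would need the same treatment.

```python
import re

# Trimmed to two alternatives for illustration; comments live outside
# the pattern to keep the regex itself easy to compare with the original.
# 1) '\ '  : an escaped space is kept in verbose mode.
# 2) '[ ]' : a space inside a character class is kept as well.
fixed_pattern = re.compile(r'''(
    (CAPITAL\ ONE)|
    WELLS[ ]FARGO.*(CARD|CC)
    )
    ''', flags=re.I | re.X
)

samples = ('CAPITAL ONE', 'WELLS FARGO CARD', 'CAPITALONE')
results = {s: fixed_pattern.search(s) is not None for s in samples}

for text, matched in results.items():
    print(text, '->', 'found' if matched else 'not found')
# CAPITAL ONE -> found
# WELLS FARGO CARD -> found
# CAPITALONE -> not found
```

Using \s in place of the escaped space would also work, with the difference that \s matches tabs and newlines too, not only a single space.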