I would like to extract the substrings of a string that are separated by special characters,
e.g. a = "A1+A2*A3-A4"
Is there a way to get the output A1, A2, A3, A4 if the list of operators is ['*','/','+','-']?
I know .split() works fine for a single operator, but what would be the best way to do this for a more complicated expression?
>Solution :
You can use re.split. You need to escape the operators, as several of them have a special meaning in regex:
import re
a = "A1+A2*A3-A4"
operators = ['*','/','+','-']
# escape each operator, then join them into one alternation pattern
re.split('|'.join(map(re.escape, operators)), a)
output: ['A1', 'A2', 'A3', 'A4']
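To see why the escaping matters, here is a minimal sketch that prints the generated pattern and then tries the same join without re.escape. The variable names (pattern, unescaped_ok) are only illustrative and not part of the original answer.

```python
import re

a = "A1+A2*A3-A4"
operators = ['*', '/', '+', '-']

# Escaped: every operator is matched literally.
pattern = '|'.join(map(re.escape, operators))
print(pattern)
print(re.split(pattern, a))

# Unescaped: '*' and '+' are read as quantifiers with nothing to repeat,
# so compiling the pattern raises re.error.
try:
    re.split('|'.join(operators), a)
    unescaped_ok = True
except re.error as exc:
    unescaped_ok = False
    print("invalid pattern:", exc)
```

The escaped pattern splits as expected, while the unescaped one is rejected before any splitting happens.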
Alternatively, if you only have single-character operators, you can handcraft a character class (the - goes last so that it is not read as a range):
re.split(r'[/*+-]', a)
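If you would rather not handcraft the class, it can probably be built from the list as well. The sketch below assumes two hypothetical helpers, char_class_pattern and alternation_pattern, which are not from the original answer. The second one covers the case where some operators are longer than one character (for example '**'), by trying the longest operators first.

```python
import re

def char_class_pattern(operators):
    # Hypothetical helper: builds a character class from single-character operators.
    # re.escape also escapes '-', '^' and ']', so their position does not matter.
    if any(len(op) != 1 for op in operators):
        raise ValueError("a character class only works for single-character operators")
    return "[" + "".join(re.escape(op) for op in operators) + "]"

def alternation_pattern(operators):
    # Hypothetical helper: longest operators first, so '**' is tried before '*'.
    ordered = sorted(operators, key=len, reverse=True)
    return "|".join(re.escape(op) for op in ordered)

print(re.split(char_class_pattern(['*', '/', '+', '-']), "A1+A2*A3-A4"))
print(re.split(alternation_pattern(['*', '**', '+']), "A1**A2+A3*A4"))
```

Without the length sort, '*' would match first inside '**' and leave an empty string between the two stars.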
Variants for strings that start with an operator:
Remove empty strings (anywhere in the result):
a = "=A1+A2*A3-A4"
operators = ['=', '*','/','+','-']
l = list(filter(None, re.split('|'.join(map(re.escape, operators)), a)))
output: ['A1', 'A2', 'A3', 'A4']
Remove only the empty string at the beginning:
a = "=A1+A2*A3-A4"
operators = ['=', '*','/','+','-']
l = re.split('|'.join(map(re.escape, operators)), a)
l = l[1:] if l and not l[0] else l
output: ['A1', 'A2', 'A3', 'A4']
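The two variants above give the same result for this input, but they differ once two operators sit next to each other. A small sketch, using a made-up input string "=A1+-A2" that is not from the original question:

```python
import re

operators = ['=', '*', '/', '+', '-']
pattern = '|'.join(map(re.escape, operators))

# Hypothetical input: a leading operator and two adjacent operators.
a = "=A1+-A2"
parts = re.split(pattern, a)

# Variant 1: drop every empty string, wherever it occurs.
everywhere = list(filter(None, parts))

# Variant 2: drop only a leading empty string, keep the others.
leading_only = parts[1:] if parts and not parts[0] else parts

print(parts)
print(everywhere)
print(leading_only)
```

Which one fits depends on whether an empty operand in the middle should be kept as a signal of malformed input or silently discarded.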
Keep the leading operator (the (?!^) lookahead prevents a split at the very start of the string):
a = "=A1+A2*A3-A4"
operators = ['=', '*','/','+','-']
regex = '(?!^)(?:%s)' % '|'.join(map(re.escape, operators))
l = re.split(regex, a)
output: ['=A1', 'A2', 'A3', 'A4']
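The same idea can be wrapped in a small function. This is only a sketch: split_keep_leading is a hypothetical name, and it assumes the default flags (no re.MULTILINE), so ^ only matches at the start of the whole string.

```python
import re

def split_keep_leading(text, operators):
    # (?!^) fails at position 0, so an operator at the very start
    # is not treated as a split point and stays attached to the first item.
    regex = '(?!^)(?:%s)' % '|'.join(map(re.escape, operators))
    return re.split(regex, text)

operators = ['=', '*', '/', '+', '-']
print(split_keep_leading("=A1+A2*A3-A4", operators))
print(split_keep_leading("A1+A2", operators))
```

Strings that do not start with an operator are split exactly as with the plain alternation pattern.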