Python: extract matching parentheses from a string listing

In Python, I am trying to parse a list of data saved as a string.
I want to split the string s at its top-level separators, keeping matching parentheses together, and return the pieces as a list l. The items are separated either by a semicolon (example 1) or by a comma (example 2).

I’m not very experienced with regular expressions and don’t know how to set up the expression.

It should be something like:

separate by ";" or ",", but ignore the ones inside matching parentheses "(" and ")"

# Example 1: items separated by semicolons
s = "(A=12); (B=(B1=15); (B2=17)); C=Hallo(123)"
# ...
l = ["(A=12)", "(B=(B1=15); (B2=17))", "C=Hallo(123)"]

# Example 2: items separated by commas
s = "A=15, (B=Otto(1234)), (C=18)"
# ...
l = ["A=15", "(B=Otto(1234))", "(C=18)"]

I hope someone can help me with this.

>Solution :

As explained in, for example, Regular expression to match balanced parentheses, RegEx is the wrong tool to match nested structures (even though certain RegEx implementations might have features to support this).

Regular expressions are, in general, based on regular grammars, which are not suited to analyzing recursive structures; context-free grammars and pushdown automata are typically used for such tasks.

The linked Q&A provides alternatives and discusses possible solutions and workarounds if you really need to tackle this with RegEx.
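
As a rough illustration of the non-RegEx route, the sketch below walks the string once and tracks the parenthesis nesting depth with a counter, a minimal stand-in for the stack of a pushdown automaton. The function name split_top_level and its delimiters parameter are made up for this example. It assumes that only "(" and ")" nest and that whitespace around each item can be stripped.

```python
def split_top_level(s, delimiters=";,"):
    """Split s at delimiters that are not enclosed in parentheses.

    Hypothetical helper for illustration; assumes only "(" and ")" nest.
    """
    parts = []
    current = []  # characters of the item being collected
    depth = 0     # number of "(" that are currently open

    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in input")

        if ch in delimiters and depth == 0:
            # Top-level separator: finish the current item.
            parts.append("".join(current).strip())
            current = []
        else:
            # Ordinary character, or a separator nested inside parentheses.
            current.append(ch)

    if depth != 0:
        raise ValueError("unbalanced '(' in input")

    parts.append("".join(current).strip())
    return parts


print(split_top_level("(A=12); (B=(B1=15); (B2=17)); C=Hallo(123)"))
# ['(A=12)', '(B=(B1=15); (B2=17))', 'C=Hallo(123)']
print(split_top_level("A=15, (B=Otto(1234)), (C=18)"))
# ['A=15', '(B=Otto(1234))', '(C=18)']
```

Because both ";" and "," are treated as separators by default, a string that mixes them at the top level would be split at each one; pass a single character as delimiters if only one separator should apply.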
