Python: extract matching parentheses from a string listing

In Python, I am trying to parse a list of data saved as a string.
I want to split the string s by matching parentheses and return the result as a list l. The items are separated either by a semicolon (example 1) or by a comma (example 2).

I'm not very proficient with regular expressions and don't know how to set up the expression.

It should be something like:

separate by ";" or "," but ignore the ones inside matching parentheses "(" and ")"

# Example 1: items separated by semicolons
s = "(A=12); (B=(B1=15); (B2=17)); C=Hallo(123)"
# ...
l = ["(A=12)", "(B=(B1=15); (B2=17))", "C=Hallo(123)"]

# Example 2: items separated by commas
s = "A=15, (B=Otto(1234)), (C=18)"
# ...
l = ["A=15", "(B=Otto(1234))", "(C=18)"]

I hope someone can help me with this.

>Solution :

As explained in, for example, "Regular expression to match balanced parentheses", regular expressions are the wrong tool for matching nested structures (even though certain regex implementations have features that support this).

Regular expressions are, in general, based on regular grammars, which are not suited to analyzing recursive structures. Context-free grammars and pushdown automata are typically used for such tasks.

The linked Q&A provides alternatives and discusses possible solutions and workarounds if you really need to tackle this with regular expressions.
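
For the two examples in the question, one non-regex alternative is to scan the string once and keep a counter of the current parenthesis depth, which is a very simplified stand-in for the stack of a pushdown automaton. The following is only a minimal sketch of that idea, not the approach from the linked Q&A. The function name `split_top_level` and its `separators` parameter are made up for illustration, and it assumes that parentheses are the only kind of nesting and that each item should be stripped of surrounding whitespace.

```python
def split_top_level(s, separators=";,"):
    """Split s on separator characters that are not inside parentheses.

    Hypothetical helper for illustration: only "(" and ")" are treated
    as nesting, and each resulting item is stripped of whitespace.
    """
    parts = []
    current = []  # characters of the item currently being collected
    depth = 0     # number of "(" that are still open

    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in input")

        if ch in separators and depth == 0:
            # Top-level separator: finish the current item.
            parts.append("".join(current).strip())
            current = []
        else:
            # Ordinary character, or a separator nested in parentheses.
            current.append(ch)

    if depth != 0:
        raise ValueError("unbalanced '(' in input")

    parts.append("".join(current).strip())
    return parts


print(split_top_level("(A=12); (B=(B1=15); (B2=17)); C=Hallo(123)"))
print(split_top_level("A=15, (B=Otto(1234)), (C=18)"))
```

Because the default treats both ";" and "," as separators, a comma at the top level of a semicolon-separated string would also split it. If that is not wanted, pass only the separator the input actually uses, for example `split_top_level(s, separators=";")`.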
