Python – looking for a faster way to extract substrings from string

I have a long string from which I’ve extracted the substrings I want, but I’m looking for a method that uses fewer lines of code to produce the same output. I’m after all the substrings that start with CN=, dropping everything else up to the semicolon.

example list output (see picture)

The script I’m currently using is below

    import re
    import os
    
    # System call
    os.system("")
    
    # Class of different styles
    class style():
        BLACK = '\033[30m'
        RED = '\033[31m'
        GREEN = '\033[32m'
        YELLOW = '\033[33m'
        BLUE = '\033[34m'
        MAGENTA = '\033[35m'
        CYAN = '\033[36m'
        WHITE = '\033[37m'
        UNDERLINE = '\033[4m'
        RESET = '\033[0m'
    
    CNString = "CN=User2,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=User4,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=User56,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=User9,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=Jane45 user,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=User-Donna,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=User76 smith,OU=blurb,OU=Test4,DC=Test,DC=Testal;CN=Pink Panther,OU=blurb,OU=Test,DC=Testing,DC=Testal;CN=Testuser78,OU=blurb,OU=Tester,DC=Test,DC=Testal;CN=great Scott,OU=blurb,OU=Test,DC=Test,DC=Local;CN=Leah Human,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=Alan Desai,OU=blurb,OU=Test,DC=Test,DC=Testal;CN=Duff Beer,OU=Groups,OU=Test,DC=Test,DC=Testal;CN=Jane Doe,OU=Users,OU=Test76,DC=Test,DC=Testal;CN=simple user67,OU=Users,OU=Test,DC=Test,DC=Testal;CN=test O'Lord,OU=Users,OU=Test,DC=Concero,DC=Testal"
    
    newstring1 = CNString.replace(';','];')
    print(newstring1)
    
    newstring2 = newstring1.replace(',OU=',',[OU=')
    print(newstring2)
    
    newstring3 = newstring2.replace(',[OU','],[OU')
    print(newstring3)
    
    newstring4 = newstring3.replace('],[OU',',[OU')
    print(newstring4)
    
    newstring5 = newstring4.replace('];',']];')
    print(newstring5)
    
    endstring = "]]"
    newstring6 = newstring5 + endstring
    print(newstring6)
    
    newstring7 = re.sub(r"\[.*?\]","()",newstring6)
    print(newstring7)
    
    print(style.YELLOW + "Line Break")
    
    newstring8 = newstring7.replace(',()]','')
    print(style.RESET + newstring8)
    
    newstring9 = newstring8.split(';')
    for cnname in newstring9:
        print(style.GREEN + cnname)

Solution:

Not sure why your code is juggling with those square brackets. Wouldn’t this do it?

    names = re.findall(r"\bCN=[^,;]*", CNString)
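
To make the one-liner easier to try out, here is a minimal, self-contained sketch. It assumes a shortened sample in the same format as `CNString` from the question; the names `cn_string` and `bare_names` are illustrative and not part of the original post. The capture-group variant is an optional extra for the case where only the value after `CN=` is wanted.

```python
import re

# Shortened sample in the same format as CNString from the question
# (assumed here for illustration only).
cn_string = (
    "CN=User2,OU=blurb,OU=Test,DC=Test,DC=Testal;"
    "CN=Jane45 user,OU=blurb,OU=Test,DC=Test,DC=Testal;"
    "CN=test O'Lord,OU=Users,OU=Test,DC=Concero,DC=Testal"
)

# \bCN= matches the attribute name at a word boundary; [^,;]* then
# consumes the value up to (not including) the next comma or semicolon.
names = re.findall(r"\bCN=[^,;]*", cn_string)

# Illustrative variation: with a capture group, findall returns only
# the captured value, i.e. the name without the "CN=" prefix.
bare_names = re.findall(r"\bCN=([^,;]*)", cn_string)

for name in names:
    print(name)
```

Note that this pattern treats every comma as a separator, so it would cut short a value that itself contains an escaped comma; the sample data in the question has no such values.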