Group substrings after two words

I have a string:

s = """YES string1 string2 YES string3 string4 string5 YES string6 NO String7 NO string8 string9 YES string10 string11"""

I need output like this:

wanted_output = {
    "YES": [
        "string1 string2",
        "string3 string4 string5",
        "string6",
        "string10 string11",
    ],
    "NO": ["String7", "string8 string9"],
}

I have a working function for this, but it does not look elegant to me. Do you know a more elegant way to solve it?

def convert(text):
    words = text.split()
    yes = "YES"
    no = "NO"
    yes_list = []
    no_list = []
    current = ""
    for word in words:
        if word == yes:
            current = yes
            yes_list.append("|")
            continue
        if word == no:
            current = no
            no_list.append("|")
            continue
        if current == yes:
            yes_list.append(word)
        elif current == no:
            no_list.append(word)
    yes_str = " ".join(yes_list)
    no_str = " ".join(no_list)
    yes_list = yes_str.split("|")
    no_list = no_str.split("|")
    yes_list = [yes_str.strip() for yes_str in yes_list if yes_str]
    no_list = [no_str.strip() for no_str in no_list if no_str]

    return {"YES": yes_list, "NO": no_list}

>Solution :

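Wrap each YES and NO in a delimiter character (one that is guaranteed not to appear in the text), then split on that delimiter. Note that str.replace also matches YES or NO inside longer words, so this assumes the markers only occur as standalone words.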

s = """YES string1 string2 YES string3 string4 string5 YES string6 NO String7 NO string8 string9 YES string10 string11"""


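def convert(text):
    # Wrap every marker in '*' so that splitting on '*' isolates markers and groups.
    # Use the `text` parameter here, not the global `s`.
    data = text.replace('YES', '*YES*').replace('NO', '*NO*').split('*')
    # Drop empty pieces and surrounding whitespace.
    data_strip = [i.strip() for i in data if i.strip()]
    yes_list = []
    no_list = []
    for ind, val in enumerate(data_strip):
        # The piece right after a marker is the group that belongs to it.
        if val == 'YES':
            yes_list.append(data_strip[ind + 1])
        if val == 'NO':
            no_list.append(data_strip[ind + 1])
    return {"YES": yes_list, "NO": no_list}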


print(convert(s))
>>> {'YES': ['string1 string2', 'string3 string4 string5', 'string6', 'string10 string11'], 'NO': ['String7', 'string8 string9']}
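
If a safe delimiter character cannot be guaranteed, a single pass over the words avoids the delimiter entirely. The following is only a sketch of that alternative, not the approach from the answer above; the name group_by_marker and its markers parameter are made up for illustration. Like the function in the question, it ignores any words that appear before the first marker and skips markers that are followed by no words.

```python
def group_by_marker(text, markers=("YES", "NO")):
    """Group the words that follow each marker word (hypothetical helper)."""
    groups = {marker: [] for marker in markers}
    current = None  # marker whose group is currently being filled
    for word in text.split():
        if word in markers:
            current = word
            groups[current].append([])  # start a new, empty group
        elif current is not None:
            groups[current][-1].append(word)
    # Join each group's words and drop groups that stayed empty.
    return {
        marker: [" ".join(words) for words in word_lists if words]
        for marker, word_lists in groups.items()
    }


s = """YES string1 string2 YES string3 string4 string5 YES string6 NO String7 NO string8 string9 YES string10 string11"""
print(group_by_marker(s))
```

Because markers are compared against whole words, a word such as NOTHING is treated as ordinary text rather than as a NO marker.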