Select certain elements from a list with a loop and condition

I have a list similar to this one below, but much larger.

mylist = ['          12345678912        ST',
     '                         Halterung für Fortlüfterhaube',
     '                         Material/Werkstoff: Metall-Lackiert',
     '                         **Beginn Zeichnung**',
     '          98765432164        ST',
     '                         Klappe, komplett',
     '                         **Beginn Zeichnung**',
     '          74563254671        ST',
     '                         Sieb Außen-Dm 145 x 0,8mm',
     '                         Versatz Dm 122 x 5mm tief',
     '                         Material: Niro 1.4301 - Lochblech Dm1/LA1,5mm',
     '          90876487921        M',
     '                         Gista-Profil',
     '                         mit Moosgummihohlkammer-Dichtung (EPDM)',
     '                         Farbe: schwarz, Klemmbereich: 1-2 mm',
     '                         Material: EPDM, 60 +/- 5 Shore A,',
     '          64352647971       ST',
     '                         Winkelblech für Frost Erdungskontakt (AB 434 l)',
     '                         für TGr. 78.2',
     '                         Winkelblech für Frost Erdungskontakt (AB 434 l)',
     '                         für TGr. 78.2',
     '                         für TGr. 78.2',
     '                         Material/Werkstoff: X5CrNi 1810']

The goal for me is to extract the Material name (if present) for each ID in the list (along with the ID itself).

I’ve used the following code:

import re

Materials = []
iteration_list = mylist

for item in iteration_list:
    if str(item).strip().startswith("Material"):
        
        material_index = iteration_list.index(item)
        ID = "".join(re.findall(r'\d+', str(iteration_list[material_index - 1])))
        
        if len(ID) != 11:
            ID = "".join(re.findall(r'\d+', str(iteration_list[material_index - 2])))
            
            if len(ID) != 11:
                ID = "".join(re.findall(r'\d+', str(iteration_list[material_index - 3])))
                
                if len(ID) != 11:
                    ID = "".join(re.findall(r'\d+', str(iteration_list[material_index - 4])))  
                    
                    if len(ID) != 11:
                        ID = "".join(re.findall(r'\d+', str(iteration_list[material_index - 5])))
                        
                        if len(ID) != 11:
                            ID = "".join(re.findall(r'\d+', str(iteration_list[material_index - 6])))  
                        
        Materials.extend([ID, item])

Which produces this:

['12345678912',
 '                         Material/Werkstoff: Metall-Lackiert',
 '74563254671',
 '                         Material: Niro 1.4301 - Lochblech Dm1/LA1,5mm',
 '90876487921',
 '                         Material: EPDM, 60 +/- 5 Shore A,',
 '64352647971',
 '                         Material/Werkstoff: X5CrNi 1810']

So I first look for the Material and then try to extract the corresponding ID. The problem is that the Material line sits at a varying distance below each ID, so the chain of if statements needed to find the ID by its index relative to the Material line gets complicated and ugly.

My question: is it possible to find the ID (an 11-digit number) above each Material line without writing many if statements to cover every possible offset? (The ID is always 11 digits long.)

Solution:

Instead of searching backward from each Material line for the preceding ID, store each ID in a variable as you encounter it. When you reach a Material line, the variable mid then holds the most recent ID above it.

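import re

materials = []
mid = None  # most recent 11-digit ID seen so far

for line in mylist:
    line = line.strip()
    m = re.match(r'\d{11}\b', line)
    if m:
        mid = m.group()
    elif line.startswith("Material") and mid is not None:
        # pair the material line with the last ID above it
        materials.append([mid, line])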
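
If you also want IDs that have no Material line, and only the material name rather than the whole line, the same remember-the-last-ID idea could be wrapped in a function that returns a dictionary. This is only a sketch: the names materials_by_id, ID_PATTERN, MATERIAL_PATTERN and sample are made up for illustration, and it assumes the material name is whatever follows the first colon on a line starting with "Material".

```python
import re

# Assumption: an ID is exactly 11 digits at the start of the stripped line.
ID_PATTERN = re.compile(r"\d{11}\b")
# Assumption: the material name is the text after the first colon.
MATERIAL_PATTERN = re.compile(r"Material[^:]*:\s*(.+)")


def materials_by_id(lines):
    """Map each 11-digit ID to the material named below it, or None."""
    result = {}
    current_id = None  # most recent ID seen so far
    for raw in lines:
        line = raw.strip()
        id_match = ID_PATTERN.match(line)
        if id_match:
            current_id = id_match.group()
            # Register the ID even if no Material line follows it.
            result.setdefault(current_id, None)
            continue
        material_match = MATERIAL_PATTERN.match(line)
        if material_match and current_id is not None:
            result[current_id] = material_match.group(1).strip()
    return result


# A shortened version of the list from the question.
sample = [
    '          12345678912        ST',
    '                         Halterung für Fortlüfterhaube',
    '                         Material/Werkstoff: Metall-Lackiert',
    '          98765432164        ST',
    '                         Klappe, komplett',
    '          90876487921        M',
    '                         Material: EPDM, 60 +/- 5 Shore A,',
]
result = materials_by_id(sample)
print(result)
```

Here the ID 98765432164 ends up mapped to None because no Material line follows it. If an ID could have several Material lines, this sketch keeps only the last one; collecting them in a list per ID would be a small change.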