How to fix TypeError: argument of type 'NoneType' is not iterable in this circumstance?

I’m writing a script to go through a list of root urls and find email addresses. Sometimes it returns no results. I’ve accounted for this in the code, and have followed the instructions on the answers to this question on SO to fix it, but cannot seem to figure it out.

First I’m pulling in a list of URLs:

url_list_updated = [
    'http://www.gfcadvice.com/',
    'https://trillionfinancial.com.sg/about-us/',
    'https://www.gen.com.sg/',
    'https://www.aam-advisory.com/',
    'https://www.proinvest.com.sg/',
    'http://www.gilbertkoh.com/',
    'https://dollarbureau.com/',
    'http://www.greenfieldadvisory.com/',
    'https://enpointefinancial.com/',
    'https://www.ippfa.com/',
]

Then, I’m using BeautifulSoup to find 'mailto:' and returning lists of those results:

for url in url_list_updated:
    response = requests.get(url)
    html_content = response.text
    
    soup = BeautifulSoup(html_content, 'html.parser')
    
    email_addresses = []
    for link in soup.find_all('a'):
#         if 'mailto:' != None and 'mailto:' in link.get('href'):
#         if 'mailto:' != '' and 'mailto:' in link.get('href'):
#         if 'mailto:' in link.get('href') != None:
        if 'mailto:' in link.get('href') != '':
            email_addresses.append(link.get('href').replace('mailto:', ''))
            print(email_addresses)
        else:
            pass

I know that some of the results will be empty because not every website has 'mailto:' info visible, so I’ve tried several of the NoneType fixes suggested on SO (left commented out above for reference).

The traceback always gives me this same result, even when I’m accounting for the missing results.


      7     email_addresses = []
      8     for link in soup.find_all('a'):
      9 #         if 'mailto:' != None and 'mailto:' in link.get('href'):
     10 #         if 'mailto:' != '' and 'mailto:' in link.get('href'):
     11 #         if 'mailto:' in link.get('href') != None:
---> 12         if 'mailto:' in link.get('href') != '':
     13             email_addresses.append(link.get('href').replace('mailto:', ''))
     14             print(email_addresses)

TypeError: argument of type 'NoneType' is not iterable

What should I do differently?

Solution:

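The issue is the condition itself. link.get('href') returns None for any <a> tag that has no href attribute, and 'mailto:' in None raises the TypeError before anything else is evaluated.
Comparing the result against '' or None afterwards does not help. Python chains comparison operators, so 'mailto:' in x != '' means ('mailto:' in x) and (x != ''), and the membership test still runs first. The first two commented-out variants fail for a different reason: 'mailto:' != None and 'mailto:' != '' compare the literal string, which is always true, so they never look at the href. Check that the href exists before testing it for 'mailto:'.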
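The failure can be reproduced without any network access. The snippet below is a minimal sketch: href is set to None by hand to stand in for what link.get('href') returns on a tag without an href, and has_mailto is a hypothetical helper name, not something from the original script.

```python
# Stand-in for link.get('href') on an <a> tag that has no href attribute.
href = None

# Chained comparison: evaluated as ('mailto:' in href) and (href != '').
# The membership test runs first, so None triggers the TypeError.
raised = False
try:
    'mailto:' in href != ''
except TypeError:
    raised = True


def has_mailto(href):
    """Return True only when href exists and contains 'mailto:'."""
    # `and` short-circuits: the `in` test is skipped when href is None.
    return href is not None and 'mailto:' in href


print(raised)
print(has_mailto(None), has_mailto('mailto:info@example.com'))
```

The guard works because the None check comes first and `and` stops evaluating as soon as it sees a false operand.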

href = link.get('href')
if href is not None and 'mailto:' in href:
    email_addresses.append(href.replace('mailto:', ''))
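If it helps to keep the loop body small, the same check could be pulled into a function. This is only a sketch under a few assumptions: emails_from_hrefs and sample_hrefs are made-up names, the sample values are invented, and it uses startswith plus a split on '?' (rather than the in/replace pair above) so that links such as mailto:...?subject=... yield just the address.

```python
def emails_from_hrefs(hrefs):
    """Collect addresses from mailto: links, skipping missing hrefs."""
    emails = []
    for href in hrefs:
        # href is None (or empty) when the <a> tag has no usable href.
        if not href or not href.lower().startswith('mailto:'):
            continue
        # Drop the scheme and any ?subject=... query part.
        address = href[len('mailto:'):].split('?', 1)[0].strip()
        # Keep each address once, in the order first seen.
        if address and address not in emails:
            emails.append(address)
    return emails


# Invented sample values, including the None case that caused the error.
sample_hrefs = [
    None,
    '',
    'https://example.com/contact',
    'mailto:info@example.com?subject=Hello',
    'mailto:info@example.com',
    'MAILTO:sales@example.com',
]
found = emails_from_hrefs(sample_hrefs)
print(found)
```

In the original loop this would be called once per page, for example with (link.get('href') for link in soup.find_all('a')) as the argument, which also keeps the results for each URL in a single list.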