Good afternoon. For a university Python project I need to extract a table from a website, but the link doesn’t exist, so I need my loop to ignore that link and move on to the next one. How can I do that?
I’m using Python to create a dataset of soundtracks.
I used BeautifulSoup to extract the .html, but the link doesn’t exist, so I thought about putting a
but it doesn’t work. link is the result of soup.find, which returned nothing; in fact, type(link) gives NoneType.
What can I do to recognise the nonexistent link?
Thank you for the help.
You can create a function to test whether the URL is valid. If the request raises an error, the function returns False; if it connects successfully and the server responds without an HTTP error status, it returns True. You can then use this function to filter your list and produce a new list of valid URLs.
Here is an example:
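import requests

url_list = ["http://yahoo.com", "http://a_random_site_that_does_not_exist.com", "http://google.com"]

def is_valid_url(url):
    try:
        # A timeout keeps the loop from hanging on an unresponsive host.
        response = requests.get(url, timeout=10)
        # Raises HTTPError for 4xx/5xx responses such as 404.
        response.raise_for_status()
        return True
    except requests.exceptions.RequestException:
        # Covers connection errors, timeouts and HTTP error statuses.
        return False

# Keep only the URLs that responded successfully.
valid_url_list = list(filter(is_valid_url, url_list))
print(valid_url_list)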
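Since you mention that soup.find gives you a NoneType, it may also help to check for None directly inside the loop and skip that page with continue. Below is a minimal sketch of that idea, not your actual code: the pages dictionary, its inline HTML strings and the extract_table_cells function are hypothetical stand-ins for the pages you download, and I am assuming the element you look for is a table tag.

```python
from bs4 import BeautifulSoup

# Hypothetical inline pages standing in for the HTML you download.
pages = {
    "page_with_table": "<html><body><table><tr><td>Track 1</td></tr></table></body></html>",
    "page_without_table": "<html><body><p>No table here</p></body></html>",
}

def extract_table_cells(pages):
    """Return the cell texts of the first table on each page, skipping pages without one."""
    results = {}
    for name, html in pages.items():
        soup = BeautifulSoup(html, "html.parser")
        table = soup.find("table")
        # soup.find returns None when nothing matches, so test with "is None".
        if table is None:
            continue  # ignore this page and move on to the next one
        results[name] = [td.get_text(strip=True) for td in table.find_all("td")]
    return results

tables = extract_table_cells(pages)
print(tables)
```

Only the page that actually contains a table ends up in the result. The same "if ... is None: continue" check should work for any soup.find call, whatever tag or attributes you search for.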