InvalidSchema: No connection adapters were found for "link"?

I have a dataset with multiple links and I’m trying to get the text of all the links using the code below, but I’m getting the error message InvalidSchema: No connection adapters were found for "'https://en.wikipedia.org/wiki/Wagner_Group'".

Dataset:

   links
   'https://en.wikipedia.org/wiki/Wagner_Group'
   'https://en.wikipedia.org/wiki/Vladimir_Putin'
   'https://en.wikipedia.org/wiki/Islam_in_Russia'

The code I’m using to web-scrape is:

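import requests
from bs4 import BeautifulSoup

def get_data(url):
    # Download the page and join the text of every <p> element
    page = requests.get(url)
    soup = BeautifulSoup(page.content, 'html.parser')
    text = ""
    for paragraph in soup.find_all('p'):
        text += paragraph.text
    return text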

#works fine
url = 'https://en.wikipedia.org/wiki/M142_HIMARS'
get_data(url)

#Doesn't work

df['links'].apply(get_data)
Error: InvalidSchema: No connection adapters were found for "'https://en.wikipedia.org/wiki/Wagner_Group'"

It works fine when I apply it to a single URL, but it fails when I apply it to the DataFrame column.

Thank you in advance.

Solution:

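df['links'].apply(get_data) is itself compatible with requests and bs4. The error message shows the URL wrapped in an extra pair of single quotes inside the double quotes, which suggests that the values stored in df['links'] contain literal quote characters. requests then finds no http:// or https:// scheme at the start of the string and raises InvalidSchema.
Stripping the quotes before calling get_data avoids the error. Alternatively, you can build the list of links as plain strings and loop over it: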

Example:

import requests
from bs4 import BeautifulSoup
import pandas as pd

# URLs stored as plain strings, without embedded quote characters
links = [
    'https://en.wikipedia.org/wiki/Wagner_Group',
    'https://en.wikipedia.org/wiki/Vladimir_Putin',
    'https://en.wikipedia.org/wiki/Islam_in_Russia',
]

data = []
for url in links:
    req = requests.get(url)
    soup = BeautifulSoup(req.text, 'lxml')  # requires the lxml package

    # Paragraphs that follow a table, as direct children of the article container
    for pra in soup.select('div[class="mw-parser-output"] > table~p'):
        paragraph = pra.get_text(strip=True)
        data.append({'paragraph': paragraph})

df = pd.DataFrame(data)
print(df)

Output:

                            paragraph
0    TheWagner Group(Russian:Группа Вагнера,romaniz...
1    The group came to global prominence during the...
2    Because it often operates in support of Russia...
3    The Wagner Group first appeared in Ukraine in ...
4    The Wagner Group itself was first active in 20...
..                                                 ...
440  A record 18,000 Russian Muslim pilgrims from a...
441  For centuries, theTatarsconstituted the only M...
442  A survey published in 2019 by thePew Research ...
443         Percentage of Muslims in Russia by region:
444  According to the 2010 Russian census, Moscow h...

[445 rows x 1 columns]
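
The quoted URL in the error message suggests that the strings in the links column carry literal quote characters. The following is a minimal sketch of cleaning such a value before passing it to requests. The helper names clean_url and has_http_scheme are hypothetical and not part of the original code, and the sketch assumes the only problem is surrounding whitespace and quotes.

```python
from urllib.parse import urlparse


def clean_url(raw):
    """Strip surrounding whitespace and literal quote characters (hypothetical helper)."""
    return raw.strip().strip("'\"")


def has_http_scheme(url):
    """Return True when the string starts with an http or https scheme."""
    return urlparse(url).scheme in ("http", "https")


# Value as it appears to be stored in the DataFrame: the quotes are part of the string.
raw = "'https://en.wikipedia.org/wiki/Wagner_Group'"

print(has_http_scheme(raw))             # False
print(has_http_scheme(clean_url(raw)))  # True
```

With pandas, the same cleanup could be applied to the whole column before scraping, for example df['links'].str.strip().str.strip("'\"").apply(get_data), assuming the column holds strings.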