requests and BeautifulSoup <tables>

I’m trying to pull a single value from a web table. How can I change this code so that it returns only one cell of the table instead of all of them?

Specifically, I only want the value 0.83. How could I do that?

    import requests
    from bs4 import BeautifulSoup
    
    
    url = 'https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic'
    
    headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36"}
    
    
    page = requests.get(url ,headers=headers)
    
    
    #print(page.content)
    #span class = DFlfde SwHCTb
    soup = BeautifulSoup(page.content, "html.parser")
    
    valor_taxa = soup.find_all("table",class_ ="listing" )[0]
    valor_tr = soup.find_all("tr",class_="odd")
    valor_especifico = soup.select('td', class_={'align': 'CENTER'})
    
    
    print(valor_especifico)

The output is:

C:\Users\Francisco\PycharmProjects\INSS\Scripts\python.exe C:/Users/Francisco/PycharmProjects/INSS/web.py
[<td>
<ul>
<li></li>
</ul>
</td>, <td> <strong><a class="anchor-link" href="#Taxa_de_Juros_Selic" target="_self" title="">Taxa de Juros Selic</a></strong></td>, <td>
<ul>
<li></li>
</ul>
</td>, <td><strong><a class="anchor-link" href="#Selicacumulada" target="_self" title=""> </a><a class="anchor-link" href="#Selicmensalmente" target="_self" title="">Taxa de Juros Selic Acumulada Mensalmente</a></strong></td>, <td>
<ul>
<li></li>
</ul>
</td>, <td> <a class="anchor-link" href="#Taxa" target="_self" title=""><strong>Taxa de Juros Selic Incidente sobre as Quotas do Imposto de Renda Pessoa Físic</strong>a</a></td>, <td align="LEFT"><b>Mês/Ano</b></td>, <td align="CENTER"><b>2013</b></td>, <td align="CENTER"><b>2014</b></td>, <td align="CENTER"><b>2015</b></td>, <td align="CENTER"><b>2016</b></td>, <td align="CENTER"><b>2017</b></td>, <td align="CENTER"><b>2018</b></td>, <td align="CENTER"><b>2019</b></td>, <td align="CENTER"><b>2020</b></td>, <td align="CENTER"><b>2021</b></td>, <td align="CENTER"><b>2022</b></td>, <td align="LEFT"><b>Janeiro</b></td>, <td align="CENTER">0,60%</td>, <td align="CENTER">0,85%</td>, <td align="CENTER">0,94%</td>, <td align="CENTER">1,06%</td>, <td align="CENTER">1,09%</td>, <td align="CENTER">0,58%</td>, <td align="CENTER">0,54%</td>, <td align="CENTER">0,38%</td>, <td align="CENTER">0,15%</td>, <td align="CENTER">0,73%</td>, <td align="LEFT"><b>Fevereiro</b></td>, <td align="CENTER">0,49%</td>, <td align="CENTER">0,79%</td>, <td align="CENTER">0,82%</td>, <td align="CENTER">1,00%</td>, <td align="CENTER">0,87%</td>, <td align="CENTER">0,47%</td>, <td align="CENTER">0,49%</td>, <td align="CENTER">0,29%</td>, <td align="CENTER">0,13%</td>, <td align="CENTER">0,76%</td>, <td align="LEFT"><b>Março</b></td>, <td align="CENTER">0,55%</td>, <td align="CENTER">0,77%</td>, <td align="CENTER">1,04%</td>, <td align="CENTER">1,16%</td>, <td align="CENTER">1,05%</td>, <td align="CENTER">0,53%</td>, <td align="CENTER">0,47%</td>, <td align="CENTER">0,34%</td>, <td align="CENTER">0,20%</td>, <td align="CENTER">0,93%</td>, <td align="LEFT"><b>Abril</b></td>, <td align="CENTER">0,61%</td>, <td align="CENTER">0,82%</td>, <td align="CENTER">0,95%</td>, <td align="CENTER">1,06%</td>, <td align="CENTER">0,79%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,28%</td>, <td align="CENTER">0,21%</td>, <td align="CENTER">0,83%</td>, <td 
align="LEFT"><b>Maio</b></td>, <td align="CENTER">0,60%</td>, <td align="CENTER">0,87%</td>, <td align="CENTER">0,99%</td>, <td align="CENTER">1,11%</td>, <td align="CENTER">0,93%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,54%</td>, <td align="CENTER">0,24%</td>, <td align="CENTER">0,27%</td>, <td align="CENTER"></td>, <td align="LEFT"><b>Junho</b></td>, <td align="CENTER">0,61%</td>, <td align="CENTER">0,82%</td>, <td align="CENTER">1,07%</td>, <td align="CENTER">1,16%</td>, <td align="CENTER">0,81%</td>, <td align="CENTER">0,52%</td>, <td align="CENTER">0,47%</td>, <td align="CENTER">0,21%</td>, <td align="CENTER">0,31%</td>, <td align="CENTER"></td>, <td align="LEFT"><b>Julho</b></td>, <td align="CENTER">0,72%</td>, <td align="CENTER">0,95%</td>, <td align="CENTER">1,18%</td>, <td align="CENTER">1,11%</td>, <td align="CENTER">0,80%</td>, <td align="CENTER">0,54%</td>

Process finished with exit code 0

Solution:

There are more succinct ways to do this, but breaking it down into individual steps may make it clearer.

Do a GET on the URL and check the HTTP status.

Build the "soup" from the response text.

Iterate over each table, then each tr, then each td, and print the text of the innermost td elements.

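    import requests
    from bs4 import BeautifulSoup as BS

    # Fetch the page; raise_for_status() raises on a 4xx/5xx response.
    # The := operator requires Python 3.8 or later.
    (r := requests.get('https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic')).raise_for_status()

    # The 'lxml' parser requires the lxml package to be installed.
    soup = BS(r.text, 'lxml')

    # Walk table -> row -> cell and print the text of each centered cell.
    for table in soup.find_all('table', {'class': 'listing'}):
        for tr in table.find_all('tr', {'class': 'odd'}):
            for td in tr.find_all('td', {'align': 'CENTER'}):
                print(td.text)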
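
The loop above prints every centered cell in the rows it visits, not the single 0.83 the question asks for. One possible way to isolate that value, sketched here under stated assumptions, is to find the column by its year header and the row by its month label. The names `find_cell`, `parse_percent`, and `SAMPLE_HTML` are hypothetical and introduced only for illustration. The sample markup is a trimmed copy of cells shown in the question's output, and the sketch assumes that the first row of the table holds the year headers and that the first cell of each row holds the month.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the page's table, built from values shown in the
# question's output (only three year columns and two months are kept).
SAMPLE_HTML = """
<table class="listing">
  <tr><td align="LEFT"><b>Mês/Ano</b></td><td align="CENTER"><b>2020</b></td><td align="CENTER"><b>2021</b></td><td align="CENTER"><b>2022</b></td></tr>
  <tr><td align="LEFT"><b>Março</b></td><td align="CENTER">0,34%</td><td align="CENTER">0,20%</td><td align="CENTER">0,93%</td></tr>
  <tr><td align="LEFT"><b>Abril</b></td><td align="CENTER">0,28%</td><td align="CENTER">0,21%</td><td align="CENTER">0,83%</td></tr>
</table>
"""


def find_cell(soup, row_label, column_label):
    """Return the text of the cell at (row_label, column_label), or None."""
    for table in soup.find_all("table", class_="listing"):
        rows = table.find_all("tr")
        if not rows:
            continue
        # Assumption: the first row holds the column headers (the years).
        headers = [td.get_text(strip=True) for td in rows[0].find_all("td")]
        if column_label not in headers:
            continue
        col = headers.index(column_label)
        for tr in rows[1:]:
            cells = [td.get_text(strip=True) for td in tr.find_all("td")]
            # Assumption: the first cell of each row is the month label.
            if cells and cells[0] == row_label and col < len(cells):
                return cells[col]
    return None


def parse_percent(text):
    """Convert a Brazilian-formatted percentage such as '0,83%' to 0.83."""
    return float(text.replace("%", "").replace(",", ".").strip())


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
raw = find_cell(soup, "Abril", "2022")
print(raw)                 # 0,83%
print(parse_percent(raw))  # 0.83
```

Against the live page, `soup` would be built from `r.text` as in the answer. If several tables on the page share this layout, the sketch returns the match from the first one, so the table may need to be narrowed down further.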