How to pull a row from an HTML table with Python

I’m trying to pull a number that is in a td, but the same td class is repeated across many cells, and the table has no class of its own, nor do its rows (tr). How can I get this number (1,00)?

This is the HTML (originally posted as an image, which is not available here).

My code:

import requests
from bs4 import BeautifulSoup as BS

sample_website = ('https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic')

page=requests.get(sample_website)

soup = BS(page.content, "html.parser")

for row in soup.select('table')[1:]:
    taxa = soup.select('tr')[5:]
    valor_especifico = row.find_all('td')[5:]

print(valor_especifico)

This is the output:

C:\Users\Francisco\PycharmProjects\INSS\Scripts\python.exe C:/Users/Francisco/PycharmProjects/INSS/MODULOS/web.py
[<td class="xl74" style="text-align: center; "><strong>1999</strong></td>, <td class="xl75" height="19"> <strong>janeiro</strong></td>, <td class="xl80" style="text-align: center; ">391,17</td>, <td class="xl80" style="text-align: center; ">349,88</td>, <td class="xl80" style="text-align: center; ">326,26</td>, <td class="xl80" style="text-align: center; ">302,97</td>, <td class="xl80" style="text-align: center; ">277,88</td>, <td class="xl83" height="19"> <strong>fevereiro</strong></td>, <td class="xl80" style="text-align: center; ">387,54</td>, <td class="xl80" style="text-align: center; ">347,53</td>, <td class="xl80" style="text-align: center; ">324,59</td>, <td class="xl80" style="text-align: center; ">300,84</td>, <td class="xl80" style="text-align: center; ">275,50</td>, <td class="xl83" height="19"> <strong>março</strong></td>, <td class="xl80" style="text-align: center; ">384,94</td>, <td class="xl80" style="text-align: center; ">345,31</td>, <td class="xl80" style="text-align: center; ">322,95</td>, <td class="xl80" style="text-align: center; ">298,64</td>, <td class="xl80" style="text-align: center; ">272,17</td>, <td class="xl83" height="19"> <strong>abril</strong></td>, <td class="xl80" style="text-align: center; ">380,68</td>, <td class="xl80" style="text-align: center; ">343,24</td>, <td class="xl80" style="text-align: center; ">321,29</td>, <td class="xl80" style="text-align: center; ">296,93</td>, <td class="xl80" style="text-align: center; ">269,82</td>, <td class="xl83" height="19"> <strong>maio</strong></td>, <td class="xl80" style="text-align: center; ">376,43</td>, <td class="xl80" style="text-align: center; ">341,23</td>, <td class="xl80" style="text-align: center; ">319,71</td>, <td class="xl80" style="text-align: center; ">295,30</td>, <td class="xl80" style="text-align: center; ">267,80</td>, <td class="xl83" height="19"> <strong>junho</strong></td>, <td class="xl80" style="text-align: center; ">372,39</td>, <td class="xl80" 
style="text-align: center; ">339,25</td>, <td class="xl80" style="text-align: center; ">318,10</td>, <td class="xl80" style="text-align: center; ">293,70</td>, <td class="xl80" style="text-align: center; ">266,13</td>, <td class="xl83" height="19"> <strong>julho</strong></td>, <td class="xl80" style="text-align: center; ">368,37</td>, <td class="xl80" style="text-align: center; ">337,32</td>, <td class="xl80" style="text-align: center; ">316,50</td>, <td class="xl80" style="text-align: center; ">292,00</td>, <td class="xl80" style="text-align: center; ">264,47</td>, <td class="xl83" height="19"> <strong>agosto</strong></td>, <td class="xl80" style="text-align: center; ">364,53</td>, <td class="xl80" style="text-align: center; ">335,35</td>, <td class="xl80" style="text-align: center; ">314,91</td>, <td class="xl80" style="text-align: center; ">290,52</td>, <td class="xl80" style="text-align: center; ">262,90</td>, <td class="xl83" height="19"> <strong>setembro</strong></td>, <td class="xl80" style="text-align: center; ">361,21</td>, <td class="xl80" style="text-align: center; ">333,45</td>, <td class="xl80" style="text-align: center; ">313,32</td>, <td class="xl80" style="text-align: center; ">288,03</td>, <td class="xl80" style="text-align: center; ">261,41</td>, <td class="xl83" height="19"> <strong>outubro</strong></td>, <td class="xl80" style="text-align: center; ">358,12</td>, <td class="xl80" style="text-align: center; ">331,59</td>, <td class="xl80" style="text-align: center; ">311,65</td>, <td class="xl80" style="text-align: center; ">285,09</td>, <td class="xl80" style="text-align: center; ">260,03</td>, <td class="xl83" height="19"> <strong>novembro</strong></td>, <td class="xl80" style="text-align: center; ">355,24</td>, <td class="xl80" style="text-align: center; ">329,79</td>, <td class="xl80" style="text-align: center; ">308,61</td>, <td class="xl80" style="text-align: center; ">282,46</td>, <td class="xl80" style="text-align: center; ">258,64</td>, 
<td class="xl83" height="19"> <strong>dezembro</strong></td>, <td class="xl80" style="text-align: center; ">352,46</td>, <td class="xl80" style="text-align: center; ">327,99</td>, <td class="xl80" style="text-align: center; ">305,64</td>, <td class="xl80" style="text-align: center; ">280,06</td>, <td class="xl80" style="text-align: center; ">257,04</td>]

Process finished with exit code 0

Solution:

If I understand you correctly, you want to select the value 1,00 from the table "Taxa de Juros Selic Acumulada Mensalmente":

import requests
from bs4 import BeautifulSoup


url = "https://www.gov.br/receitafederal/pt-br/assuntos/orientacao-tributaria/pagamentos-e-parcelamentos/taxa-de-juros-selic#Taxa_de_Juros_Selic"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# select correct table:
table = soup.select_one("#Selicmensalmente").find_next("table")

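# select the current row (the one that contains "maio") inside that table,
# not in the whole document:
current_row = table.select_one("tr:-soup-contains(maio)")

# get all non-empty cell values (walrus operator needs Python 3.8+):
values = [s for td in current_row.find_all(["td", "th"]) if (s := td.get_text(strip=True))]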

# print last one:
print(values[-1])

Prints:

1,00
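
The scraped value is a string that uses a decimal comma, so it cannot be passed to a numeric constructor as it is. If you need it for arithmetic, a minimal sketch of the conversion could look like the following. The helper name parse_br_decimal is hypothetical (it is not part of BeautifulSoup), and treating "." as a thousands separator is an assumption about how the page formats larger numbers:

```python
from decimal import Decimal, InvalidOperation


def parse_br_decimal(text):
    """Convert a string such as '1,00' or '391,17' to a Decimal.

    Assumption: '.' is a thousands separator and ',' is the decimal mark.
    Returns None for empty or non-numeric cells (for example a month name).
    """
    # Drop thousands separators first, then switch the decimal mark.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    if not cleaned:
        return None
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


# Sample strings taken from the outputs shown above, plus two non-numeric cells.
for raw in ["1,00", "391,17", "", "maio"]:
    print(repr(raw), "->", parse_br_decimal(raw))
```

With the code from the answer, this would be called as parse_br_decimal(values[-1]). Decimal is used here instead of float so that a rate such as 1,00 keeps its two decimal places exactly.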