Selecting 4th child div using BeautifulSoup

I have five child divs under a main div whose id is main_div, but the child divs have no id or class.

I am trying to get the text from the 4th child, "div text 04".

Here is my HTML:

<div id="main_div">
    <div>div 01</div>
    <div>div 02</div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>

I’m trying the following, but it’s not working because the child divs have no class. How can I get the text of the 4th child div?

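from bs4 import BeautifulSoup as bs

soup = bs(r.text, 'html.parser')  # r: response from an earlier requests.get(...) call
html_soup = soup.find('div', {"id": 'main_div'})  # the id in the HTML is main_div, not main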

Thanks

>Solution :

If you’re sure the structure won’t change and all you want is the 4th div, then try this, for example:

from bs4 import BeautifulSoup

sample_html = """<div id="main_div">
    <div>div 01</div>
    <div>div 02</div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>"""

# Parse the HTML, locate the parent by id, and take its 4th <div> (index 3)
text = (
    BeautifulSoup(sample_html, 'html.parser')
    .find(id='main_div')
    .find_all('div')[3]
    .text
)
print(text)

Or use a CSS selector:

soup = BeautifulSoup(sample_html, 'html.parser')
parent = soup.find(id="main_div")
# 1-based position of the child div to select
n = 4
print(parent.select_one(f"div:nth-of-type({n})").get_text())

Output:

div text 04
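
Both approaches above count every div inside main_div, so they can pick the wrong element if a child div ever contains another div. If that might happen, here is a minimal sketch that only looks at direct children and returns None instead of raising when something is missing. The helper name nth_child_div_text and the nested_html sample are made up for illustration; they are not from the question.

```python
from bs4 import BeautifulSoup


def nth_child_div_text(html, parent_id, n):
    """Return the text of the n-th (1-based) direct child <div> of the element
    with the given id, or None if the parent or that child does not exist."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find(id=parent_id)
    if parent is None:
        return None
    # recursive=False keeps only direct children, so nested divs are not counted
    children = parent.find_all("div", recursive=False)
    if not 1 <= n <= len(children):
        return None
    return children[n - 1].get_text(strip=True)


# Hypothetical variant of the question's HTML with one nested div added
nested_html = """<div id="main_div">
    <div>div 01 <div>nested</div></div>
    <div>div 02</div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>"""

print(nth_child_div_text(nested_html, "main_div", 4))  # div text 04
print(nth_child_div_text(nested_html, "main_div", 9))  # None
```

With a CSS selector, the same "direct children only" restriction can be written with the child combinator, for example soup.select_one("#main_div > div:nth-of-type(4)").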