I have this .html code:

<div id="content">
            <ul id="tree">
                <li xmlns="" class="level top failed open">
                    <span><em class="time">
                            <div class="time">1.89 s</div>
                        </em>I need to get this text</span>

I need only the text that sits directly inside the <span>, outside all of the nested tags (here: "I need to get this text").

I was trying to use this piece of code:

path = document.find('li', class_='level top').find_all("em")[-1].next_sibling
if not path:
    path = document.find('li', class_='level top failed open').find_all("em")[-1].next_sibling
return path

But I get an error: AttributeError: 'NoneType' object has no attribute 'find_all'.

Does anybody know how to access this text?



Solution:

You can use .contents, which returns a list of the tag's direct children. The text you want is the last item in that list, so index it with [-1].

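from bs4 import BeautifulSoup

html = '''
<div id="content">
 <ul id="tree">
  <li class="level top failed open" xmlns="">
   <span>
    <em class="time">
     <div class="time">1.89 s</div>
    </em>I need to get this text</span>
  </li>
 </ul>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# .contents lists the direct children of <span>; the last one is the bare text node
txt = soup.select_one('#tree > li > span').contents[-1]
print(txt.strip())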


Output:

I need to get this text
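As for the AttributeError in the question: the likely cause is that a multi-word class_ value such as 'level top' is compared against the whole class attribute string, so it does not match class="level top failed open". find() then returns None and the chained .find_all fails. Below is a minimal sketch of the original next_sibling approach with that lookup changed, assuming the same markup as in the question and the built-in html.parser (the variable names missing, li and text are only illustrative):

```python
from bs4 import BeautifulSoup

html = '''
<ul id="tree">
  <li class="level top failed open">
    <span><em class="time"><div class="time">1.89 s</div></em>I need to get this text</span>
  </li>
</ul>
'''

soup = BeautifulSoup(html, "html.parser")

# A multi-word class_ string is matched against the exact attribute value,
# so this lookup finds nothing for class="level top failed open".
missing = soup.find("li", class_="level top")

# A CSS selector matches each class independently, whatever other classes are present.
li = soup.select_one("li.level.top")

# The wanted text is the node that comes right after the last <em> inside the <li>.
text = li.find_all("em")[-1].next_sibling.strip()
print(text)
```

Selecting with li.level.top keeps working when extra classes such as failed or open come and go, which seems to be what the fallback in the question was trying to handle.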

