How to get just the node text, not the children's text in puppeteer

Suppose the page's markup looks like this:

<div id="monday">
  ...
  <div class="dish">
    Potato Soup
    <br>
    <span>With smoked tofu</span>
  </div>
</div>

How, using puppeteer, would I be able to grab just the text node’s content, not everything inside .dish?

I’ve tried

let selector = await page.waitForSelector("#monday .dish");
let text = await selector.evaluate(el => el.textContent) ?? "";

but that returns "Potato SoupWith smoked tofu"

Solution:

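textContent returns the text of the element and of all its descendants, which is why the span's text is included. To get only the element's own text, pick its first non-empty text node among the direct children and trim the surrounding whitespace:

let text = await selector.evaluate(el => Array.from(el.childNodes)
  // nodeType 3 is a text node; skip whitespace-only ones
  .find(node => node.nodeType === 3 && node.textContent.trim())
  ?.textContent.trim());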

A nodeType of 3 (Node.TEXT_NODE) identifies a text node; checking nodeName === '#text' works as well.

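The same approach as plain browser JavaScript:

const elem = document.querySelector("#monday .dish");

// Text of the first direct text node of .dish, without the surrounding whitespace
const text = Array.from(elem.childNodes).find(node => node.nodeType === 3)?.textContent.trim();

console.log(text)

run against this markup:

<div id="monday">
  <div class="dish">
    Potato Soup
    <br>
    <span>With smoked tofu</span>
  </div>
</div>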
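
If .dish can hold several direct text nodes, or whitespace-only ones between its child elements, taking only the first text node may not be enough. A slightly more defensive variant could look like the sketch below. The helper name ownText and the mock dish object are assumptions made for illustration; the mock only imitates the childNodes of the .dish element so the snippet can run outside a browser.

```javascript
// Hypothetical helper: returns the text of an element's direct text nodes only,
// ignoring text that lives inside child elements such as <span>.
function ownText(el) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM
  return Array.from(el.childNodes)
    .filter(node => node.nodeType === TEXT_NODE)
    .map(node => node.textContent.trim())
    .filter(text => text.length > 0) // drop whitespace-only text nodes
    .join(" ");
}

// Mock of the .dish element (an assumption, not real DOM nodes):
// text, <br>, whitespace, <span>, whitespace.
const dish = {
  childNodes: [
    { nodeType: 3, textContent: "\n    Potato Soup\n    " },
    { nodeType: 1, textContent: "" },
    { nodeType: 3, textContent: "\n    " },
    { nodeType: 1, textContent: "With smoked tofu" },
    { nodeType: 3, textContent: "\n  " },
  ],
};

console.log(ownText(dish)); // "Potato Soup"
```

Because ownText does not reference anything outside its own body, it should be possible to pass it straight to Puppeteer, for example await selector.evaluate(ownText), where it would run against the real element in the page.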
