Presume the website layout looks like this:
<div id="monday">
...
<div class="dish">
Potato Soup
<br>
<span>With smoked tofu</span>
</div>
</div>
How, using Puppeteer, would I grab just the text node’s content, not everything inside .dish?
I’ve tried
let selector = await page.waitForSelector("#monday .dish");
let text = await selector.evaluate(el => el.textContent) ?? "";
but that returns "Potato SoupWith smoked tofu"
Solution:
textContent returns the text of the element and all of its descendants, so that result is expected. What you can do instead is select the first text node, like below:
// trim() strips the newlines and indentation around the text node
let text = await selector.evaluate(el => Array.from(el.childNodes)
.find(node => node.nodeType === 3)?.textContent.trim()) ?? "";
nodeType === 3 means the node is a text node (3 is the value of Node.TEXT_NODE). Alternatively, you can check nodeName === '#text'.
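const elem = document.querySelector("#monday .dish");
// First direct child that is a text node (nodeType 3), with surrounding whitespace removed
const text = Array.from(elem.childNodes).find(node => node.nodeType === 3)?.textContent.trim();
console.log(text); // "Potato Soup"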
<div id="monday">
<div class="dish">
Potato Soup
<br>
<span>With smoked tofu</span>
</div>
</div>
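If the element's own text could be split across several text nodes, or the first text node might be whitespace only, a slightly more defensive variant may help. The sketch below is only an illustration: ownText is a hypothetical helper name, not part of Puppeteer or the DOM, and the plain object used to exercise it is a stand-in for a real .dish element so the example can run outside a browser.

```javascript
// Hypothetical helper (not a Puppeteer or DOM API): collects the text of an
// element's direct text-node children and ignores text inside child elements.
function ownText(el) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers
  return Array.from(el.childNodes)
    .filter(node => node.nodeType === TEXT_NODE)
    .map(node => node.textContent.trim())
    .filter(Boolean) // drop whitespace-only text nodes
    .join(" ");
}

// Stand-in for the .dish element from the question (assumption: a real DOM
// element exposes the same childNodes / nodeType / textContent shape).
const dish = {
  childNodes: [
    { nodeType: 3, textContent: "\n  Potato Soup\n  " },
    { nodeType: 1, textContent: "" },                 // <br>
    { nodeType: 3, textContent: "\n  " },
    { nodeType: 1, textContent: "With smoked tofu" }, // <span>
    { nodeType: 3, textContent: "\n" },
  ],
};

console.log(ownText(dish)); // "Potato Soup"
```

Because the helper does not rely on anything outside its own body, it should be possible to pass it straight to Puppeteer, e.g. await selector.evaluate(ownText), where selector is the element handle from the question.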