Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Don't expand on results when using Google Sheets' importXML

Google Sheets has a cool function called importxml. It lets you scrape data from public web sources e.g. a URL’s title.

You can write an XPath expression e.g. //title/text() and it will find all occurrences of some value and display those in your Google Sheet as separate rows. Nifty!

If the contents of <element></element> contain HTML markup e.g. <element>some<b>data</b>is<em>here</em>. then Google Sheets gives you two options:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  1. if you append /text() at the end of your XPath then the result is the direct text value of the element e.g. some here
  2. if you do not append /text() at the end of your XPath then the result is the entire text split across multiple cells in the row. You end up with |some|data|is|here|. (where | acts as my cell divider)

I am trying to get some data is here in a single cell, not across multiple cells on my row. Is there a way to do that?

Using textjoin or concatenate on importxml doesn’t work because it then applies it to the entire result and if there are multiple instances of <element>some<b>data</b>is<em>here</em>. then I get them all in a single cell instead of one result per row (which is what I want).

>Solution :

try:

=TEXTJOIN(" "; 1; your_importxml_here)

for array use:

=FLATTEN(QUERY(TRANSPOSE(your_importxml_here);;9^9))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading