Selenium C# driver.PageSource – 'is too long, or a component of the specified path is too long.'

I'm trying to pass driver.PageSource from Selenium (C#) to HTML Agility Pack, but the line htmlDoc.Load(driver.PageSource); throws this error: '…' is too long, or a component of the specified path is too long.

P.S. The equivalent Python code, using Selenium and Beautiful Soup, does not produce this error.

How can I resolve this problem?

Full Code:

using System;
using System.Linq; // needed for the .ToList() call below
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

namespace SeleniumSharp
{
    public static class WebScraping
    {
        public static void GetPageData()
        {
            // initial setup
            IWebDriver driver = new ChromeDriver();
            driver.Navigate().GoToUrl("<url>");

            // dropdown
            var dropdown1 = driver.FindElement(By.Id("cpMain_ucc1_ctl00_liResidentialFront"));
            dropdown1.Click();
            
            // enter search query
            var search = driver.FindElement(By.Id("cpMain_ucc1_ctl00_txtResidentialSearchBox"));
            search.Click();
            search.SendKeys("london");
            Thread.Sleep(3000);

            // submit search
            var submit = driver.FindElement(By.XPath("//div[@id='cpMain_ucc1_ctl00_pnlContentResidential']//a[@class='search-button']"));
            submit.Click();

            // Html Agility Pack
            HtmlDocument htmlDoc = new HtmlDocument();
            htmlDoc.Load(driver.PageSource);

            var address = htmlDoc.DocumentNode
                .SelectNodes("//div[@class='grid-address']")
                .ToList();

            foreach(var item in address)
            {
                Console.WriteLine(item.InnerText);
            }

        }

        
    }
}

This line of code throws the error:

htmlDoc.Load(driver.PageSource);

Error:

'<html source>' is too long, or a component of the specified path is too long.
   at System.IO.PathHelper.GetFullPathName(ReadOnlySpan`1 path, ValueStringBuilder& builder)
   at System.IO.PathHelper.Normalize(String path)
   at System.IO.Path.GetFullPath(String path)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamReader.ValidateArgsAndOpenPath(String path, Encoding encoding, Int32 bufferSize)  
   at System.IO.StreamReader..ctor(String path, Encoding encoding)
   at HtmlAgilityPack.HtmlDocument.Load(String path)

Solution:

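This happens because you are calling Load instead of LoadHtml. Load(string) expects the path to a file that contains HTML, not the HTML source itself. The whole of driver.PageSource is therefore treated as a file path, which is why the stack trace ends in System.IO.Path.GetFullPath.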

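// From a file: the argument is a path on disk
var docFromFile = new HtmlDocument();
docFromFile.Load(filePath);

// From a string: the argument is the HTML markup itself
var docFromString = new HtmlDocument();
docFromString.LoadHtml(html);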

So in your code, use this instead:

htmlDoc.LoadHtml(driver.PageSource);
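
As a minimal sketch of the fix in isolation, the snippet below parses an HTML string with LoadHtml and then runs the same XPath query as the question. The html literal and its address text are made-up stand-ins for driver.PageSource, and the class name LoadHtmlSketch is hypothetical. It also guards against SelectNodes returning null, which HTML Agility Pack does when no node matches; in the original code that case would throw on .ToList().

```csharp
using System;
using HtmlAgilityPack;

public static class LoadHtmlSketch
{
    public static void Main()
    {
        // Assumption: a hard-coded string standing in for driver.PageSource.
        // It is HTML markup, not a file path.
        string html =
            "<html><body><div class='grid-address'>1 Example Street, London</div></body></html>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html); // parse the markup directly from the string

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var addresses = htmlDoc.DocumentNode.SelectNodes("//div[@class='grid-address']");
        if (addresses == null)
        {
            Console.WriteLine("No matching nodes found.");
            return;
        }

        foreach (var item in addresses)
        {
            Console.WriteLine(item.InnerText.Trim());
        }
    }
}
```

In the question's scraper, the only change needed is replacing the html literal with driver.PageSource; the null check is optional but avoids a NullReferenceException if the results page has not rendered any grid-address elements yet.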