Scrapy Python. How do I get the "item_scraped_count" in the terminal?

I am new to Python programming and web scraping. My code is intended to scrape the title, price, and URL of each book on the website. However, my terminal output is missing the "item_scraped_count" stat that usually appears in the tutorial's terminal output.

This is my code:

import scrapy


class BookspiderSpider(scrapy.Spider):
    name = "bookspider"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com/"]

    def parse(self, response):
        books = response.css('article.product_prod')

        for book in books:
            yield{
                'title' : book.css("h3 a::text").get(),
                'price' : book.css(".product_price .product_color::text").get(),
                'url' : book.css("h3 a").attrib['href']
            }

        next_page = response.css("li.next a::attr(href)").get()

        if next_page is not None:
            if 'catalogue/' in next_page:
                next_page_url = 'https://books.toscrape.com/' + next_page
            else:
                next_page_url = 'https://books.toscrape.com/catalogue/' + next_page
            yield response.follow(next_page_url, callback=self.parse)

This is the result in the terminal:

{'downloader/request_bytes': 15159,
 'downloader/request_count': 51,
 'downloader/request_method_count/GET': 51,
 'downloader/response_bytes': 2556339,
 'downloader/response_count': 51,
 'downloader/response_status_count/200': 50,
 'downloader/response_status_count/404': 1,
 'elapsed_time_seconds': 20.302593,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2023, 12, 27, 12, 54, 23, 623889, tzinfo=datetime.timezone.utc),
 'log_count/DEBUG': 54,
 'log_count/INFO': 10,
 'request_depth_max': 49,
 'response_received_count': 51,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/404': 1,
 'scheduler/dequeued': 50,
 'scheduler/dequeued/memory': 50,
 'scheduler/enqueued': 50,
 'scheduler/enqueued/memory': 50,
 'start_time': datetime.datetime(2023, 12, 27, 12, 54, 3, 321296, tzinfo=datetime.timezone.utc)}
2023-12-27 19:54:23 [scrapy.core.engine] INFO: Spider closed (finished) 

Thank you in advance for helping me with this problem.

I tried to get the "item_scraped_count" in my terminal, but it doesn't show up.

> Solution:

There is no item_scraped_count stat because there were no items scraped.

There were no items scraped because your books variable is empty, because the 'article.product_prod' selector didn't match anything, because the correct class name is "product_pod".
