How to get response status of large list of subdomains?

I have been trying to check the status of a large list of subdomains all at once. I tried several approaches, including grequests and faster than requests, but neither helped much. I then switched to asyncio with aiohttp, and it is now slower than the plain requests library. I also checked and found that it is not actually sending the requests asynchronously; it sends them one after another.

I know that "await resp.status" is a problem because resp.status cannot be awaited, but I tried removing the await and the behavior is still the same.

Any help would be much appreciated! Thanks!

import aiohttp
import asyncio
import time

start_time = time.time()


async def main():
    # List of 1000 subdomains; some of them do not exist
    data = [ "LIST OF 1000 SUBDOMAINS" ]

    async with aiohttp.ClientSession() as session:
        for url in data:
            pokemon_url = f'{url}'
            try:
                async with session.get(pokemon_url, ssl=False) as resp:
                    pokemon = await resp.status
                    # If the subdomain exists, print the status
                    print(pokemon)
            except:
                # Otherwise print the subdomain that does not exist or cannot be reached
                print(url)

asyncio.run(main())
print("--- %s seconds ---" % (time.time() - start_time))

Solution:

"I have tried multiple techniques even grequests"

grequests works fine for this; you don't have to use async if you don't want to.

import grequests
import time

urls = ['https://httpbin.org/delay/4' for _ in range(4)]
# each of these requests takes 4 seconds to complete
# run serially, they would take at least 16 (4 * 4) seconds in total

reqs = [grequests.get(url) for url in urls]
start = time.time()
for resp in grequests.imap(reqs, size=4):
    print(resp.status_code)
end = time.time()
print('finished in', round(end-start, 2), 'seconds')

Output:

200
200
200
200
finished in 4.32 seconds
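
If you would rather stay with asyncio and aiohttp, the likely reason the code in the question runs one request after another is that each session.get(...) is awaited inside the for loop, so the next request only starts once the previous one has finished. Below is a minimal sketch of one way around that, not a definitive implementation: it creates one coroutine per URL and runs them together with asyncio.gather, using a semaphore to cap how many are in flight. The names check_all, fetch_status and guarded, the limit of 100 and the 10 second timeout are my own assumptions, not something from the question.

```python
import asyncio
import time


async def check_all(urls, fetch_status, limit=100):
    """Call fetch_status(url) for every URL concurrently, at most `limit` at a time.

    Returns a list of (url, result) pairs in input order, where result is
    either whatever fetch_status returned or the exception it raised.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url):
        async with semaphore:
            try:
                return url, await fetch_status(url)
            except Exception as exc:  # e.g. DNS failure, timeout, refused connection
                return url, exc

    # gather schedules all coroutines at once instead of awaiting them one by one
    return await asyncio.gather(*(guarded(url) for url in urls))


async def main(urls):
    # aiohttp is third-party; it is imported here so check_all stays stdlib-only
    import aiohttp

    start = time.perf_counter()
    timeout = aiohttp.ClientTimeout(total=10)  # assumed value, tune as needed
    async with aiohttp.ClientSession(timeout=timeout) as session:

        async def fetch_status(url):
            async with session.get(url, ssl=False) as resp:
                return resp.status  # plain attribute, so no await here

        for url, result in await check_all(urls, fetch_status):
            print(url, result)
    print("finished in", round(time.perf_counter() - start, 2), "seconds")


# Example call (needs network access and aiohttp installed):
# asyncio.run(main(["https://httpbin.org/delay/4"] * 4))
```

Unreachable subdomains come back as an exception object next to their URL instead of aborting the whole run, which mirrors the try/except in the question. The semaphore is there because opening 1000 connections at once can hit local socket limits or get you rate limited.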