Split a list of promises into groups. Execute all promises in a group in parallel and the groups in sequence

What I do:

I’m crawling the web. I have a list of website links and create a promise from each one (the promise is basically a crawler). Currently I run them in sequence: with 10 links, I crawl the first link, wait for it to finish, crawl the second link, and so on.

What I need:

What I’m trying to achieve is to group my promises. The promises within a group should run in parallel, but the groups themselves should run in sequence.

For example, I have 10 links and create 10 promises from them.
I then split the promises into groups with at most 3 promises per group.
It should then crawl the first 3 (the first group), wait for them to finish, then run the 4th, 5th, and 6th (the second group), and so on.

What I tried:

I created a method to split promises:

export function splitPromises<T>(promises: Promise<T>[], maxPerItem: number): Promise<T>[][] {
  const splitPromisesList: Promise<T>[][] = [];
  let currentSplit: Promise<T>[] = [];

  for (let i = 0; i < promises.length; i++) {
    currentSplit.push(promises[i]);

    if (currentSplit.length === maxPerItem || i === promises.length - 1) {
      splitPromisesList.push(currentSplit);
      currentSplit = [];
    }
  }

  return splitPromisesList;
}

Then I wrote a method that uses that splitting and runs the promises:

async function crawler(links: string[], page: Page): Promise<MyData[]> {
  const list: MyData[] = [];

  const crawlPromises = links.map(async (link, index) => {
    try {
      const newPage = await page.browser().newPage();
      const detail = await crawlLink(link, newPage);
      await newPage.close();
      return detail;
    } catch (e) {
      console.log(e);
      return null as MyData;
    }
  });

  const groupedPromises = splitPromises<MyData>(crawlPromises, 3);
  let results: MyData[] = [];

  for (const group of groupedPromises) {
    results = await Promise.all(group);
    const filteredResults: MyData[] = results.filter((detail) => detail !== null) as MyData[];
    list.push(...filteredResults);
  }


  return list;
}

What my issue is:
I’m not sure what I’m doing wrong, but it executes all promises at once, not group by group.

Solution:

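Once a promise has been created, its work is already in flight. Awaiting the promises in batches won’t delay the work, because `links.map(async ...)` has already started every crawl by the time the promises are split into groups. You instead need to batch the creation of the promises.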
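To see the eager behaviour in isolation, here is a minimal sketch. `fakeCrawl`, `started`, and `demoLinks` are hypothetical stand-ins for illustration, not part of the original crawler; a timer takes the place of the real page work.

```typescript
// Hypothetical stand-in for the real crawler: records when work starts.
const started: string[] = [];

async function fakeCrawl(link: string): Promise<string> {
  // An async function body runs synchronously up to its first `await`,
  // so this line executes as soon as fakeCrawl is called.
  started.push(link);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return `crawled ${link}`;
}

const demoLinks = ["a", "b", "c"];

// Nothing has been awaited yet, but every crawl has already begun.
const demoPromises = demoLinks.map((link) => fakeCrawl(link));
console.log(started.length); // 3

// Awaiting later (in groups or otherwise) only collects the results.
Promise.all(demoPromises).then((results) => console.log(results.length));
```

Grouping `demoPromises` after this point cannot change when the work starts, which is why the original code ran everything at once.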

A function for splitting an array into chunks is still useful, but you need to make it work on an array of strings, not just an array of promises:

export function splitArray<T>(array: T[], maxPerItem: number): T[][] {
  const splitList: T[][] = [];
  // ... basically the same implementation as before, with different variable names
  return splitList;
}
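For reference, one possible way to fill in that body is sketched below. It uses `slice` rather than the push-and-reset loop from the question, and the guard on `maxPerItem` is an addition that the original code does not have.

```typescript
// A sketch of a generic chunking helper; works for strings, numbers, or any T.
function splitArray<T>(array: T[], maxPerItem: number): T[][] {
  // Added guard (an assumption, not in the original): a size below 1
  // would otherwise make the loop below never advance.
  if (!Number.isInteger(maxPerItem) || maxPerItem < 1) {
    throw new RangeError("maxPerItem must be a positive integer");
  }

  const splitList: T[][] = [];
  for (let i = 0; i < array.length; i += maxPerItem) {
    // slice stops at the end of the array, so the last chunk may be shorter.
    splitList.push(array.slice(i, i + maxPerItem));
  }
  return splitList;
}

console.log(splitArray([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3));
// [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7, 8, 9 ], [ 10 ] ]
```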

And then you need to chunk the links and create your array of promises from just that one chunk. You’ll then wait for the chunk to finish before moving on to the next chunk.

async function crawler(links: string[], page: Page): Promise<MyData[]> {
  const list: MyData[] = [];

  const chunks = splitArray(links, 3);

  for (const chunk of chunks) {
    const crawlPromises = chunk.map(async (link, index) => {
      // .. same as before, except we're mapping over `chunk` instead of `links`
    });
    const results = await Promise.all(crawlPromises);
    const filteredResults: MyData[] = results.filter((detail) => detail !== null) as MyData[];
    list.push(...filteredResults);
  }

  return list;
}
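The same pattern can be tried without a browser. The sketch below is an illustration under assumptions: `runInChunks`, `fakeCrawlLink`, `running`, and `maxRunning` are hypothetical names, and a timer replaces the real `crawlLink` call. It counts how many tasks are in flight at once to check that the limit holds.

```typescript
// Hypothetical generic helper: runs `task` over `items`, one chunk at a time.
async function runInChunks<T, R>(
  items: T[],
  chunkSize: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const results: R[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    // The promises are created only here, so this chunk's work starts only
    // after the previous chunk has finished.
    const chunkResults = await Promise.all(chunk.map((item) => task(item)));
    results.push(...chunkResults);
  }
  return results;
}

// Stand-in for crawlLink: tracks how many tasks run at the same time.
let running = 0;
let maxRunning = 0;

async function fakeCrawlLink(link: string): Promise<string> {
  running++;
  maxRunning = Math.max(maxRunning, running);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  running--;
  return `crawled ${link}`;
}

const tenLinks = Array.from({ length: 10 }, (_, i) => `link-${i + 1}`);

const chunkedRun = runInChunks(tenLinks, 3, fakeCrawlLink).then((results) => {
  console.log(results.length, maxRunning); // 10 3
  return results;
});
```

Note that each chunk waits for its slowest task before the next chunk starts. If that idle time matters, a worker-pool approach that starts a new task as soon as one finishes is an alternative, at the cost of more bookkeeping.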