I am considering implementing a top-level general backend that calls various specialized backends via non-blocking HTTP requests (using promises, futures, etc.).
The responses to those requests may take seconds or even minutes.
I understand that every non-blocking request puts something like a promise or future into a queue of promises/futures in my top-level general (non-specialized) backend.
The queue grows when the top-level backend makes a request, and it shrinks when a request from the queue gets its response and the corresponding promise/future is resolved.
There is one thing I cannot get my head around.
If my top-level backend makes, say, 1,000 HTTP requests per second, and responses take anywhere from seconds (quite slow) to minutes (really slow), with an average of 1 minute per request, then the underlying language/technology/framework and hardware (RAM, disk, whatever) must be able to keep/maintain/process a queue of 60K outstanding promises/futures.
And if the rate is higher, say 100K requests per second, it must be able to maintain a queue of 6M promises/futures.
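The arithmetic above is just Little's law (outstanding requests = arrival rate × average response time). A quick sketch, reproducing the two numbers from the question:

```javascript
// Little's law: outstanding requests = arrival rate (req/s) * avg response time (s)
function outstandingRequests(ratePerSecond, avgResponseSeconds) {
  return ratePerSecond * avgResponseSeconds;
}

console.log(outstandingRequests(1_000, 60));   // 1K req/s, 60s avg -> 60000 pending
console.log(outstandingRequests(100_000, 60)); // 100K req/s, 60s avg -> 6000000 pending
```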
Something tells me that having 6M outstanding promises/futures at every single moment of runtime is wrong, unreliable, and unlikely to be supported by typical non-blocking technologies (Java frameworks, Node.js, or non-blocking frameworks in other languages).
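For what it's worth, the promise objects themselves are not the scarce resource. A quick Node.js experiment (exact memory numbers will vary by machine and Node version) holding one million pending promises in memory at once:

```javascript
// Create 1,000,000 promises that never resolve and keep their resolvers,
// roughly simulating a very large queue of outstanding requests.
const pending = [];
const before = process.memoryUsage().heapUsed;

for (let i = 0; i < 1_000_000; i++) {
  let resolve;
  const p = new Promise((res) => { resolve = res; });
  pending.push({ p, resolve });
}

const after = process.memoryUsage().heapUsed;
console.log(`held ${pending.length} pending promises, ` +
            `~${((after - before) / 1024 / 1024).toFixed(0)} MB of heap`);
```

The real constraint is usually the sockets and file descriptors behind those promises, not the promise objects themselves.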
I can think of scaling the top-level backend out (to multiple smaller uniform instances) to keep its promise/future queue within the capabilities of the chosen technologies/hardware.
Yet I have a feeling that the very approach of long-lived requests/promises/futures is wrong, and that there is something else I cannot remember or am not aware of.
I feel like having the specialized backends respond immediately with 'acknowledged, working' and then polling for results could be a better option, though it delays any response that arrives between polls by up to one polling interval.
The question is not bound to any specific language, framework, or technology. Still, if you can answer using a language/framework/technology of your choice, with concrete numbers and code snippets, please do.
The very question is:
Is it OK to make thousands of non-blocking HTTP requests every second when the responses take seconds or minutes to arrive?
Please share an example of something working like this successfully.
EDIT: I am considering PC only.
P.S. I hope the question allows for certain precise answers and stays conformant to SO rules.
>Solution :
> maintain a queue of 6M promises/futures.
A PC generally has about 64K TCP ports per source IP, so a PC could not sustain 6M port-consuming promises at one time, I think.
But outside of the PC world, maybe there is specialized hardware that does have the ports to support this.
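To put a rough number on that limit: the theoretical maximum is 64K ports, but a common Linux default ephemeral port range is 32768–60999 (about 28K usable ports). Assuming one ephemeral port per in-flight request to a single destination (keep-alive pooling and HTTP/2 multiplexing change this picture), Little's law caps the sustainable rate from one source IP to one destination:

```javascript
// Max sustainable request rate = available ephemeral ports / avg connection lifetime.
// Assumes one ephemeral port per in-flight request to a single destination.
const ephemeralPorts = 60999 - 32768 + 1; // common Linux default range
const avgResponseSeconds = 60;
const maxRatePerSecond = Math.floor(ephemeralPorts / avgResponseSeconds);
console.log(`${ephemeralPorts} ports / ${avgResponseSeconds}s -> ` +
            `~${maxRatePerSecond} req/s per (source IP, destination)`);
```

So with 60-second average responses, a single source IP talking to a single destination tops out at only a few hundred requests per second on default settings, far below the 100K req/s scenario in the question.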