map vs. flatMap with Mono.fromSupplier for long-running operations

August 5, 2023

I’m using Project Reactor as part of a project using Spring WebFlux and learned that Flux.map should only be used for "simple" operations. From https://stackoverflow.com/a/49169314/947526:

map is for synchronous, non-blocking, 1-to-1 transformations

flatMap is for asynchronous (non-blocking) 1-to-N transformations

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.
Visit Medevel

In my understanding, something like fluxOfStrings.map(String::toUpperCase) is fine, but flux.map(this::computeStuff) is not (assuming computeStuff takes a considerable amount of time).

However, I fail to understand how (and if) I can use flatMap to improve my code. Based on the example above, I could write

fluxOfSomething.flatMap(value ->
    Mono.fromSupplier(
        () -> computeStuff(value)
    )
)

Is this better, in any sense? If so, how can I see the improvement using a simple test? If not, what can I do to trigger computeStuff instead?

>Solution :

Let’s try to see when to use each:

For computationally expensive tasks

If your computeStuff fun is just computationally expensive and doesn’t involve any I/O, you could use either map or flatMap based on how you manage threads. If you’re okay with the operation running on the same thread (typically, the event loop), map is fine too. If you want to offload it to another thread, you’ll need to wrap it in a Mono.fromSupplier and use flatMap instead, as the SOF answer suggested.

For I/O-bound or blocking tasks

If computeStuff does I/O or any blocking operation, you definitely should not use map. Instead, you may need to offload this to another thread. flatMap becomes useful here because you’re typically returning a Mono or Flux which represents the result of an async operation.

In your example:

fluxOfSomething.flatMap(value ->
    Mono.fromSupplier(
        () -> computeStuff(value)
    ).subscribeOn(Schedulers.boundedElastic()) 
)

You’re using flatMap cause you’re returning a Mono from the function. The combination of Mono.fromSupplier and .subscribeOn(Schedulers.boundedElastic()) (I added this) ensures that the computeStuff fun runs in another thread, not blocking the main reactive thread, which is generally recommended.

To notice a difference:

Blocking Scenario: If computeStuff is a blocking operation and you use map, you’ll quickly see issues when you put your service under load. The main reactive thread will get blocked, leading to a significant degradation in performance;
Async Scenario: If you use flatMap and offload the work to another thread, under the same load, you should see better performance since the main reactive thread isn’t blocked;
Simple Transformations: Simple transformations that don’t block or aren’t computationally intensive, there would be a negligible performance difference between using map & flatMap

Let’s try to write some dummy test now.

Prerequisites:
Let’s imagine that you’ve already set up the two endpoints (/map and /flatMap) and some blockingOperation (computeStuff) method

We may write something like:

@Test
public void testMapVsFlatMap() {
    long startMap = System.currentTimeMillis();
    webTestClient.get().uri("/map")
            .exchange()
            .expectStatus().isOk()
            .expectBodyList(String.class);
    long endMap = System.currentTimeMillis();

    long startFlatMap = System.currentTimeMillis();
    webTestClient.get().uri("/flatMap")
            .exchange()
            .expectStatus().isOk()
            .expectBodyList(String.class);
    long endFlatMap = System.currentTimeMillis();

    // flatMap is indeed faster than map for our simulated blocking OP
    assertTrue((endFlatMap - startFlatMap) < (endMap - startMap));
}