Efficient numpy value assignment via boolean mask

I have a value-assignment problem that requires an efficient boolean mask operation.

It's a multi-dimensional mask and I'm using einsum to get the result, but the operation is not very efficient, and I'm wondering if I can get some help with it.
Here is my current solution (mask, truth_value, and false_value are dummy data whose dtypes and shapes match my problem):

import numpy as np

mask = np.random.randn(1000, 50) > 0.5
truth_value = np.random.randn(50, 10)
false_value = np.random.randn(10)
objective = np.einsum('ij,jk->ijk', mask, truth_value) + np.einsum('ij,k->ijk', ~mask, false_value)

Is there a faster way to get objective given mask, truth_value, and false_value?

While I was waiting, I figured out a faster way:

objective = np.where(mask[..., np.newaxis], np.broadcast_to(truth_value, (1000, 50, 10)), np.broadcast_to(false_value, (1000, 50, 10)))

But is there an even faster alternative?
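
As a side note on the np.where version: np.where broadcasts its three arguments against each other, so the explicit np.broadcast_to calls are probably not needed. The following is a minimal sketch under that assumption, using the same dummy shapes as above; the fixed seed and the name reference are only there for illustration, and it makes no claim about speed.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only to make this sketch reproducible
mask = rng.standard_normal((1000, 50)) > 0.5
truth_value = rng.standard_normal((50, 10))
false_value = rng.standard_normal(10)

# mask[..., np.newaxis] has shape (1000, 50, 1); truth_value (50, 10) and
# false_value (10,) broadcast against it to the full shape (1000, 50, 10).
objective = np.where(mask[..., np.newaxis], truth_value, false_value)

# Cross-check against the einsum formulation from the question.
reference = (np.einsum('ij,jk->ijk', mask, truth_value)
             + np.einsum('ij,k->ijk', ~mask, false_value))
```

Both arrays should agree element by element, which can be checked with np.allclose(objective, reference).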

Solution:

You can use Numba's JIT compiler to do this more efficiently:

import numpy as np
import numba as nb

@nb.njit('float64[:,:,::1](bool_[:,::1], float64[:,::1], float64[::1])')
def blend(mask, truth_value, false_value):
    n, m = mask.shape
    l = false_value.shape[0]
    assert truth_value.shape == (m, l)
    result = np.empty((n, m, l), dtype=np.float64)
    for i in range(n):
        for j in range(m):
            if mask[i, j]:
                result[i, j, :] = truth_value[j, :]
            else:
                result[i, j, :] = false_value[:]
    return result

mask = np.random.randn(1000, 50) > 0.5
truth_value = np.random.randn(50, 10)
false_value = np.random.randn(10)
objective = blend(mask, truth_value, false_value)

On my machine, this computes objective about 4.8 times faster than the original np.einsum version.

If this is not fast enough, you can try to parallelize the code by passing parallel=True to the decorator and using nb.prange instead of range in the i-based loop. This is not guaranteed to be faster, because of the overhead of creating threads. On my machine (with 6 cores), the parallel version is 7.4 times faster than np.einsum, which is only about 1.5 times faster than the sequential Numba version (creating threads is pretty expensive compared to the execution time).

Another possible optimization is to write the result directly into a buffer allocated ahead of time (this only helps if you call the function multiple times with the same array sizes).
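
To illustrate the preallocated-buffer idea, here is a minimal sketch that uses plain NumPy (np.copyto) rather than Numba, so it is a stand-in for the technique and not the code that was timed. The helper name blend_into and the three-iteration loop are assumptions for illustration. The same idea applies to the Numba function by passing the result array in as an extra argument instead of calling np.empty inside it.

```python
import numpy as np

def blend_into(out, mask, truth_value, false_value):
    """Fill the preallocated array `out` in place and return it."""
    # Fill everything with false_value, broadcast over the first two axes.
    np.copyto(out, false_value)
    # Overwrite the rows where mask is True with the matching truth_value row.
    np.copyto(out, truth_value, where=mask[..., np.newaxis])
    return out

truth_value = np.random.randn(50, 10)
false_value = np.random.randn(10)

# Allocate once, then reuse the same buffer for every call.
out = np.empty((1000, 50, 10), dtype=np.float64)
for _ in range(3):
    mask = np.random.randn(1000, 50) > 0.5
    objective = blend_into(out, mask, truth_value, false_value)
```

Because every call overwrites the same buffer, a result that must survive the next call has to be copied first.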

Here are the overall timings on my machine:

np.einsum:         4.32 ms
np.where:          1.72 ms
numba sequential:  0.89 ms
numba parallel:    0.58 ms
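
To get comparable numbers on another machine, something like the following timing harness could be used. It is only a sketch: the helper names, the repeat counts, and the timings dictionary are arbitrary choices, it covers just the two NumPy variants, and the results will differ from the figures above.

```python
import timeit
import numpy as np

mask = np.random.randn(1000, 50) > 0.5
truth_value = np.random.randn(50, 10)
false_value = np.random.randn(10)

def with_einsum():
    return (np.einsum('ij,jk->ijk', mask, truth_value)
            + np.einsum('ij,k->ijk', ~mask, false_value))

def with_where():
    return np.where(mask[..., np.newaxis],
                    np.broadcast_to(truth_value, (1000, 50, 10)),
                    np.broadcast_to(false_value, (1000, 50, 10)))

timings = {}
for name, fn in [("np.einsum", with_einsum), ("np.where", with_where)]:
    # Best of 5 repeats with 20 calls each, converted to seconds per call.
    timings[name] = min(timeit.repeat(fn, number=20, repeat=5)) / 20
    print(f"{name + ':':<12} {timings[name] * 1e3:.2f} ms")
```

When a Numba function is added to the list, it should be called once beforehand so that compilation time is not counted.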