How to compare a three-dimensional array with a two-dimensional array in NumPy?

I have a three-dimensional array of shape (height, width, 3) that represents a BGR image; the values are floats in [0, 1].

After some operation on the pixels I obtain a two-dimensional array of shape (height, width); each value is the result of that operation on the corresponding pixel.

Now I want to compare the original image with the result. More specifically, I want to compare each of the BGR components of each pixel with the value of the result array at the same coordinates.

For instance, I want to know which of the BGR components is the greatest in each pixel:

import numpy as np

img = np.random.random((360, 640, 3))
maxa = img.max(axis=-1)

Now I want to compare img with maxa. I know img == maxa doesn’t work:

In [335]: img == maxa
<ipython-input-335-acb909814b9a>:1: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  img == maxa
Out[335]: False

I am not good at describing things, so I will show you what I intend to do in Python:

result = [[[c == maxa[y, x] for c in img[y, x]] for x in range(640)] for y in range(360)]

Obviously it is inefficient, but I want to demonstrate that I know the logic.

I have managed to do the same in NumPy, but I think it can be more efficient:

img == np.dstack([maxa, maxa, maxa])

I have confirmed my comprehension’s correctness:

In [339]: result = [[[c == maxa[y, x] for c in img[y, x]] for x in range(640)] for y in range(360)]
     ...: np.array_equal(result, img == np.dstack([maxa, maxa, maxa]))
Out[339]: True

And I have benchmarked my methods:

In [340]: %timeit [[[c == maxa[y, x] for c in img[y, x]] for x in range(640)] for y in range(360)]
509 ms ± 16.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [341]: maxals = maxa.tolist()

In [342]: imgls = img.tolist()

In [343]: %timeit [[[c == maxals[y][x] for c in imgls[y][x]] for x in range(640)] for y in range(360)]
156 ms ± 2.57 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [344]: %timeit img == np.dstack([maxa, maxa, maxa])
4.25 ms ± 121 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

What is a faster method?

>Solution :

If this is what you want:

img == np.dstack([maxa, maxa, maxa])

Then this is how you do it:

img == maxa[..., np.newaxis]

NumPy's broadcasting rules align shapes starting from the last (innermost) dimension: missing outer (leading) dimensions are added as required, but inner (trailing) dimensions are not. That means you can compare shape [x, y, z] with shape [y, z] (for all x, do …) but not with shape [x, y] (for all z, do …). However, an inner dimension of length 1 is stretched to match, which allows the appropriate broadcasting. This is what we do with np.newaxis: it turns the [x, y] shape into [x, y, 1].
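As a quick check, here is a minimal sketch on a small random array (the 4 × 5 size and the seed are arbitrary choices for illustration, not taken from the question). It confirms that the broadcast comparison matches the np.dstack version, and it also shows keepdims=True, which should give the same trailing length-1 axis directly from the reduction:

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
img = rng.random((4, 5, 3))      # small stand-in for the (360, 640, 3) image
maxa = img.max(axis=-1)          # shape (4, 5)

# Append a length-1 trailing axis: (4, 5) -> (4, 5, 1),
# which broadcasts against (4, 5, 3).
mask = img == maxa[..., np.newaxis]

# The explicit stacking from the question, for comparison.
expected = img == np.dstack([maxa, maxa, maxa])

# Alternative: keep the reduced axis when taking the maximum,
# so the result already has shape (4, 5, 1).
mask_keepdims = img == img.max(axis=-1, keepdims=True)

print(mask.shape)
print(np.array_equal(mask, expected))
print(np.array_equal(mask, mask_keepdims))
```

Unlike np.dstack, maxa[..., np.newaxis] is a view of maxa, so no stacked (height, width, 3) copy has to be built before the comparison.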

The DeprecationWarning was only about comparing arrays whose shapes cannot be broadcast together and getting a single False back; it is not a warning against using == with arrays.
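If you want to check whether two shapes are compatible without running the comparison itself, something like the following sketch may help. It assumes NumPy 1.20 or later, where np.broadcast_shapes is available; the shapes are the ones from the question:

```python
import numpy as np

# Shapes are aligned from the right, so (360, 640, 3) vs (360, 640)
# pairs 3 with 640 and 640 with 360: neither pair matches or is 1.
try:
    np.broadcast_shapes((360, 640, 3), (360, 640))
    compatible = True
except ValueError:
    compatible = False

# With a trailing length-1 axis the shapes line up.
fixed = np.broadcast_shapes((360, 640, 3), (360, 640, 1))

print(compatible)
print(fixed)
```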
