I have an array like this:
array = np.random.randint(1, 100, 10000).astype(object)
array[[1, 2, 6, 83, 102, 545]] = np.nan
array[[3, 8, 70]] = None
Now I want to find the indices of the NaN items and ignore the None ones. In this example, I want to get the indices [1, 2, 6, 83, 102, 545]. I can build a NaN mask with np.isnan and np.equal:
np.isnan(array.astype(float)) & (~np.equal(array, None))
I checked the performance of this solution with %timeit and got the following result:
243 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Is there a faster solution?
>Solution:
array != array
The classic NaN test. Writing NaN tests like this is one of the reasons behind the NaN != NaN design decision: the IEEE 754 designers couldn't assume programmers would have access to an isnan routine. It works here because NaN is the only value in the array that does not compare equal to itself; the integers and the None entries all do, so the mask is True exactly at the NaN positions.
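A minimal sketch of that self-inequality behaviour with plain Python values (nothing here depends on the arrays above):
float("nan") != float("nan")   # True: NaN never compares equal to itself
None != None                   # False: None compares equal to itself
42 != 42                       # False: ordinary numbers compare equal to themselves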
This significantly outperforms the code in the question when I try it:
In [1]: import numpy as np
In [2]: array = np.random.randint(1, 100, 10000).astype(object)
...: array[[1, 2, 6, 83, 102, 545]] = np.nan
...: array[[3, 8, 70]] = None
In [3]: %timeit array != array
139 µs ± 46.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
In [4]: %timeit np.isnan(array.astype(float)) & (~np.equal(array, None))
755 µs ± 123 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
And of course, it does give the same output:
In [5]: result1 = array != array
In [6]: result2 = np.isnan(array.astype(float)) & (~np.equal(array, None))
In [7]: np.array_equal(result1, result2)
Out[7]: True
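If you need the actual index positions rather than the boolean mask (which is what the question asks for), np.flatnonzero extracts them from either mask; a small follow-up sketch under the same setup as above:
indices = np.flatnonzero(array != array)
# With the setup above, this gives array([1, 2, 6, 83, 102, 545]).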