Why is casting an np.uint8 array to np.float16 slower than casting it to np.float32?


I couldn’t find an explanation for this. ndarray.astype() returns a new array, so I expected the cast to np.float16 to be faster than the cast to np.float32, since it allocates less memory. However, it takes more than twice as long.

import numpy as np
original_array = np.ones([10, 512, 1280, 3], dtype=np.uint8)

Here are the results:

%%timeit -r 10
float16_array = original_array.astype(np.float16)

93.5 ms ± 1.68 ms per loop (mean ± std. dev. of 10 runs, 10 loops each)

%%timeit -r 10
float32_array = original_array.astype(np.float32)

41.4 ms ± 278 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)

>Solution :

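Your CPU probably has instructions that NumPy can use for the uint8->float32 conversion. On x86, for instance, CVTDQ2PS converts packed 32-bit integers to float32 and handles between four and sixteen values per instruction, depending on whether SSE2, AVX, or AVX-512 is in use (the uint8 values are first widened to 32-bit integers). There is probably no equally fast path for float16, so that conversion costs more work per element even though the result occupies half the memory. Half-precision float support is relatively sparse outside of GPUs.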
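To check this on your own machine outside of IPython, here is a minimal sketch using the standard timeit module. The helper name time_cast and the smaller sample array are assumptions made for illustration and are not part of the question. Absolute timings will vary with the CPU and the NumPy build, so no expected numbers are shown.

```python
import timeit

import numpy as np


def time_cast(array, dtype, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` astype calls.

    Hypothetical helper for illustration; not part of NumPy.
    """
    timer = timeit.Timer(lambda: array.astype(dtype))
    return min(timer.repeat(repeat=repeats, number=1))


# Smaller than the array in the question so the script finishes quickly.
# It contains every possible uint8 value (0..255), each repeated 4096 times.
sample = np.arange(256, dtype=np.uint8).repeat(4096)

for dtype in (np.float16, np.float32):
    seconds = time_cast(sample, dtype)
    print(f"{np.dtype(dtype).name}: {seconds * 1e3:.3f} ms")

# Every uint8 value is exactly representable in float16, so both casts
# produce the same numbers; only the conversion speed differs.
as_half = sample.astype(np.float16)
as_single = sample.astype(np.float32)
values_match = np.array_equal(as_half.astype(np.float32), as_single)
print("identical values:", values_match)
```

Because the values are identical either way, the choice between the two dtypes for uint8 input is a trade-off between conversion time and memory footprint, not accuracy.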
