Why do plus and minus have different promotion rules although the results are the same?

August 4, 2023

I wonder why a - b and a + (-b) give the same result but with different types in NumPy:

import numpy as np

minuend = np.array(1, dtype=np.int64)
subtrahend = 1 << 63

result_minus = minuend - subtrahend
result_plus = minuend + (-subtrahend)

print(result_minus == result_plus)  # True
print(type(result_minus))  # <class 'numpy.float64'>
print(type(result_plus))  # <class 'numpy.int64'>

Why is that, and where can I read about it?

Solution:

1 << 63 cannot be represented as an int64, but -(1 << 63) can. This is an edge case that comes from how signed integers are represented in binary (two's complement): int64 covers -2^63 through 2^63 - 1, so the most negative value has no positive counterpart. When an operand does not fit in int64, NumPy ends up promoting the operation to a floating-point type (float64) here. More generally, binary operations that mix integer and floating-point types promote the integer to floating point in NumPy (a behaviour inherited mainly from the C language, since NumPy is written in C).

Here is a way to see that:

>>> np.int64(1 << 63)
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Cell In [7], line 1
----> 1 np.int64(1 << 63)

OverflowError: Python int too large to convert to C long

>>> np.int64(-(1 << 63))
-9223372036854775808
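The same asymmetry can be checked without NumPy. The sketch below is only an illustration: `fits_int64` is a hypothetical helper (not part of NumPy or the question) that asks Python whether a value can be encoded as an 8-byte two's complement integer.

```python
def fits_int64(value: int) -> bool:
    """Return True if value can be stored as a signed 64-bit integer.

    Hypothetical helper for illustration only: it tries to encode the
    value as 8 bytes of two's complement and reports whether that works.
    """
    try:
        value.to_bytes(8, "little", signed=True)
        return True
    except OverflowError:
        return False


print(fits_int64(1 << 63))        # False: 2**63 is one past the int64 maximum
print(fits_int64((1 << 63) - 1))  # True: the int64 maximum
print(fits_int64(-(1 << 63)))     # True: the int64 minimum
```

So negating the subtrahend before handing it to NumPy is what moves it back inside the int64 range.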

PS: it is actually a bit more complex here: the subtrahend is stored as uint64 because int64 is too small to hold the value. NumPy then chooses float64 for the result because a binary operation on mixed uint64 and int64 operands is likely to overflow (large unsigned values do not fit in a signed type, and negative signed values cannot be represented in an unsigned one). Here is how it works:

>>> np.array(subtrahend).dtype
dtype('uint64')

# NumPy first converts both operands to float64 to avoid overflows caused by the mixed uint64 + int64 integer types
>>> (np.array(subtrahend) + minuend).dtype
dtype('float64')

>>> np.array(-subtrahend).dtype
dtype('int64')
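As a rough sketch of the same promotion with explicit dtypes: the arrays below are given their types by hand (the names `big_unsigned` and `big_negative` are made up for this example), so the outcome should not depend on how a particular NumPy version infers a dtype for a bare Python int. As far as I know, NumPy 2.x treats Python integers differently from the version used in the question, so the original snippet may not behave the same way there.

```python
import numpy as np

# Explicitly typed 1-d arrays (illustrative names, not from the question).
minuend = np.array([1], dtype=np.int64)
big_unsigned = np.array([1 << 63], dtype=np.uint64)    # 2**63 only fits in uint64
big_negative = np.array([-(1 << 63)], dtype=np.int64)  # -2**63 fits in int64

# No integer type covers both the int64 and the uint64 range,
# so the common type of the two is float64.
common = np.promote_types(np.int64, np.uint64)

result_minus = minuend - big_unsigned  # int64 - uint64 -> float64
result_plus = minuend + big_negative   # int64 + int64  -> int64

print(common)              # float64
print(result_minus.dtype)  # float64
print(result_plus.dtype)   # int64

# float64 has a 53-bit significand, so 1 - 2**63 is rounded to -2**63,
# while the int64 result keeps the exact value -2**63 + 1.
print(int(result_minus[0]) - int(result_plus[0]))  # -1
```

This also suggests why the comparison in the question prints True: the int64 result is converted to float64 for the comparison and rounds to the same value, even though the exact integer results differ by one.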