How to make all values in an array fall into a range?

Say I have a NumPy array of floats containing both positive and negative values. I also have two numbers a and b with a <= b, so that [a, b] is a closed range.

I want every element of the array to fall within [a, b]. More specifically, I want to replace each value outside the range with the nearest endpoint of the range.

I am not trying to scale the values to fit them into the range. In Python that would be:

[a + (e - a) / (b - a) for e in arr]

Or in NumPy:

a + (arr - a) / (b - a)

I am trying to replace all values lower than a with a and all values higher than b with b, while leaving all other values unchanged. I can do it with a single list comprehension in Python:

[e if a <= e <= b else (a if e < a else b) for e in arr]

I can do the same in NumPy with two boolean-mask assignments (the "broadcasts" approach):

arr[arr < a] = a
arr[arr > b] = b

Even though NumPy is much faster than pure Python, the above makes two passes over the array instead of one, so the method is compiled but still inefficient.

What is a faster way?


I have measured this multiple times, and pure Python is indeed much slower, as expected:

In [1]: import numpy as np

In [2]: numbers = np.random.random(4096) * 1024

In [3]: %timeit numbers[numbers < 256]
16.1 µs ± 219 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit numbers[numbers > 512]
20.9 µs ± 526 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [5]: %timeit [e if 256 <= e <= 512 else (256 if e < 256 else 512) for e in numbers]
927 µs ± 101 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [6]: %timeit [e if 256 <= e <= 512 else (256 if e < 256 else 512) for e in numbers.tolist()]
684 µs ± 38.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Solution:

You can use np.clip. The NumPy documentation describes it as follows:

Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

It is faster than the two-assignment (broadcasts) approach.

Code Example:

import numpy as np

arr = np.array([-3, 5, 10, -7, 2, 8, -12, 15])

a = 0
b = 10

new_arr = np.clip(arr, a, b)
print(new_arr)
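
As a quick sanity check, the sketch below compares np.clip with the two masked assignments from the question and also shows the out argument, which clips into an existing array instead of allocating a new one. The names masked and in_place are illustrative only.

```python
import numpy as np

arr = np.array([-3, 5, 10, -7, 2, 8, -12, 15])
a, b = 0, 10

# np.clip returns a new array and leaves `arr` untouched.
clipped = np.clip(arr, a, b)

# The two masked assignments from the question modify their target,
# so work on a copy to keep `arr` intact for the comparison.
masked = arr.copy()
masked[masked < a] = a
masked[masked > b] = b

# Passing out= writes the result into an existing array (in-place clipping).
in_place = arr.copy()
np.clip(in_place, a, b, out=in_place)

print(clipped.tolist())  # [0, 5, 10, 0, 2, 8, 0, 10]
print(np.array_equal(clipped, masked), np.array_equal(clipped, in_place))
```

If the original array is no longer needed, the out= form avoids the extra allocation. Otherwise the plain np.clip(arr, a, b) call is the simpler choice.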

TIME MEASUREMENT

For Array size of 1000
Method 1 (List comprehension) time: 0.0115 seconds
Method 2 (NumPy broadcasts) time: 0.0009 seconds
Method 3 (np.clip()) time: 0.0009 seconds

-----------------------------------------------------------------

For Array size of 10000
Method 1 (List comprehension) time: 0.1137 seconds
Method 2 (NumPy broadcasts) time: 0.0069 seconds
Method 3 (np.clip()) time: 0.0017 seconds

-----------------------------------------------------------------

For Array size of 100000
Method 1 (List comprehension) time: 1.3205 seconds
Method 2 (NumPy broadcasts) time: 0.1152 seconds
Method 3 (np.clip()) time: 0.0107 seconds

-----------------------------------------------------------------

For Array size of 1000000
Method 1 (List comprehension) time: 13.8250 seconds
Method 2 (NumPy broadcasts) time: 1.0064 seconds
Method 3 (np.clip()) time: 0.1973 seconds
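
The script that produced these timings is not shown. A minimal sketch of how such a comparison could be set up with the standard timeit module follows. The function names, bounds, seed, array sizes, and repeat count are all assumptions rather than the original benchmark, and absolute numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as np

A, B = 256.0, 512.0  # assumed bounds, borrowed from the question's example


def clip_listcomp(arr):
    # Pure-Python version from the question; returns a list.
    return [e if A <= e <= B else (A if e < A else B) for e in arr]


def clip_masks(arr):
    # Two masked assignments, applied to a copy so the input stays unchanged.
    out = arr.copy()
    out[out < A] = A
    out[out > B] = B
    return out


def clip_numpy(arr):
    # Single call to np.clip; returns a new array.
    return np.clip(arr, A, B)


rng = np.random.default_rng(0)  # fixed seed so runs are comparable
for size in (1_000, 10_000):
    data = rng.random(size) * 1024
    print(f"For array size of {size}")
    for fn in (clip_listcomp, clip_masks, clip_numpy):
        seconds = timeit.timeit(lambda: fn(data), number=10)
        print(f"  {fn.__name__}: {seconds:.4f} seconds for 10 runs")
```

Each function leaves its input untouched, so the same data array can be reused for every method without one run affecting the next.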
