Is it possible to find the 0th index-position of a 2D numpy array (not) containing a given value?

What I want, and expect

I have a 2D numpy array containing integers. My goal is to find the index of the array(s) that do not contain a given value (using numpy functions). Here is an example of such an array, named ortho_disc:

>>> ortho_disc 
Out: [[1 1 1 0 0 0 0 0 0]
      [1 0 1 1 0 0 0 0 0]
      [0 0 0 0 0 0 2 2 0]]

If I wish to find the arrays not containing 2, I would expect an output of [0, 1], as the first and second rows of ortho_disc do not contain the value 2.

What I have tried

I have looked into np.argwhere, np.nonzero, np.isin and np.where without getting the expected results. My best attempt using np.where was the following:

>>> np.where(2 not in ortho_disc, [True]*3, [False]*3) 
Out: [False False False]

But it does not return the expected [True, True, False]. This is especially weird given the output when ortho_disc's rows are evaluated individually:

>>> 2 not in ortho_disc[0] 
Out: True

>>> 2 not in ortho_disc[1] 
Out: True

>>> 2 not in ortho_disc[2]
Out: False

Using argwhere

Using np.argwhere, all I get is an empty array (not the expected [0, 1]):

>>> np.argwhere(2 not in ortho_disc) 
Out: []

I suspect this is because numpy first flattens ortho_disc, then checks the truth-value of 2 not in ortho_disc?
The same empty array is returned using np.nonzero(2 not in ortho_disc).

My code

import numpy as np
ortho_disc = np.array([[1, 1, 1, 0, 0, 0, 0, 0, 0],
                       [1, 0, 1, 1, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 2, 2, 0]])
polymer = 2

print(f'>>> ortho_disc \nOut:\n{ortho_disc}\n')
print(f'>>> {polymer} not in {ortho_disc[0]} \nOut: {polymer not in ortho_disc[0]}\n')
print(f'>>> {polymer} not in {ortho_disc[1]} \nOut: {polymer not in ortho_disc[1]}\n')
print(f'>>> {polymer} not in {ortho_disc[2]} \nOut: {polymer not in ortho_disc[2]}\n\n')

breakpoint = np.argwhere(polymer not in ortho_disc)
print(f'>>>np.argwhere({polymer} not in ortho_disc) \nOut: {breakpoint}\n\n\n')

Output:

>>> ortho_disc 
Out:
[[1 1 1 0 0 0 0 0 0]
 [1 0 1 1 0 0 0 0 0]
 [0 0 0 0 0 0 2 2 0]]

>>> 2 not in [1 1 1 0 0 0 0 0 0] 
Out: True

>>> 2 not in [1 0 1 1 0 0 0 0 0] 
Out: True

>>> 2 not in [0 0 0 0 0 0 2 2 0] 
Out: False


>>>np.argwhere(2 not in ortho_disc) 
Out: []

Expected output

From the bottom two lines:

breakpoint = np.argwhere(polymer not in ortho_disc)
print(f'>>>np.argwhere({polymer} not in ortho_disc) \nOut: {breakpoint}\n\n\n')

I expect the following output:

>>>np.argwhere(2 not in ortho_disc) 
Out: [0, 1]

Summary

I would really love feedback on how to solve this issue, as I have been scratching my head for ages over what seems to be an easy problem. I want to avoid the obvious 'easy-way-out' loop over ortho_disc and would prefer a solution that uses numpy functions.

Thanks in advance!

Solution:

In [13]: ortho_disc
Out[13]: 
array([[1, 1, 1, 0, 0, 0, 0, 0, 0],
       [1, 0, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 2, 2, 0]])

In [14]: polymer = 2

In [15]: (ortho_disc != polymer).all(axis=1).nonzero()[0]
Out[15]: array([0, 1])

Breaking it down: ortho_disc != polymer is an array of bools:

In [16]: ortho_disc != polymer
Out[16]: 
array([[ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True, False, False,  True]])

We want the rows that are all True; for that, we can apply the all() method along axis 1 (i.e. reducing across the columns, which gives one result per row):

In [17]: (ortho_disc != polymer).all(axis=1)
Out[17]: array([ True,  True, False])

That’s the boolean mask for the rows that do not contain polymer.
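
For contrast, here is a small sketch of why the attempts in the question produce a single value rather than one value per row. Python evaluates `polymer not in ortho_disc` before any NumPy function sees it, and the membership test on an array collapses to one bool for the whole array. The names `whole_array_test` and `row_mask` are illustrative only.

```python
import numpy as np

ortho_disc = np.array([[1, 1, 1, 0, 0, 0, 0, 0, 0],
                       [1, 0, 1, 1, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 2, 2, 0]])
polymer = 2

# Membership on the whole array yields one Python bool, not one per row.
whole_array_test = polymer not in ortho_disc
print(whole_array_test)                    # False: 2 occurs somewhere in the array

# np.argwhere(False) has nothing nonzero to report, hence the empty result.
print(np.argwhere(whole_array_test).size)  # 0

# The elementwise comparison keeps the row structure, so all(axis=1)
# can reduce each row separately.
row_mask = (ortho_disc != polymer).all(axis=1)
print(row_mask)                            # [ True  True False]
```

The same reasoning applies to `np.where(2 not in ortho_disc, ...)` and `np.nonzero(2 not in ortho_disc)`: each receives the single value False.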

Use nonzero() to find the indices of the values that are not 0 (True is considered nonzero, False is 0):

In [19]: (ortho_disc != polymer).all(axis=1).nonzero()
Out[19]: (array([0, 1]),)

Note that nonzero() returned a tuple with length 1; in general, it returns a tuple with the same length as the number of dimensions of the array. Here the input array is 1-d. Pull out the desired result from the tuple by indexing with [0]:

In [20]: (ortho_disc != polymer).all(axis=1).nonzero()[0]
Out[20]: array([0, 1])
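
As a minimal sketch, the steps above can be wrapped in a small helper. The function name `rows_without` is hypothetical (it is not part of NumPy), and `np.flatnonzero` is swapped in for `.nonzero()[0]`; for a 1-d mask the two give the same indices.

```python
import numpy as np


def rows_without(arr, value):
    """Return the indices of the rows of a 2D array that do not contain value.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(arr)
    # True for each row in which no element equals value.
    mask = (arr != value).all(axis=1)
    # flatnonzero gives the positions of the True entries as a 1-d array.
    return np.flatnonzero(mask)


ortho_disc = np.array([[1, 1, 1, 0, 0, 0, 0, 0, 0],
                       [1, 0, 1, 1, 0, 0, 0, 0, 0],
                       [0, 0, 0, 0, 0, 0, 2, 2, 0]])

print(rows_without(ortho_disc, 2))  # [0 1]
print(rows_without(ortho_disc, 1))  # [2]
```

If every row contains the value (for example `rows_without(ortho_disc, 0)`), the result is an empty index array.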