Let’s suppose I have an np.array like:
array([[1., 1., 0., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 0.],
       [1., 1., 1., 1., 0.]])
I would like to know if there is a Pythonic way to find all the columns that contain at least one occurrence of 0. In the example, I would like to retrieve the indexes 2 and 4.
I need to remove those columns, but I also need to know how many columns I have removed (the indexes are not strictly necessary).
So in the end I simply need the result
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
>Solution :
If you simply want to remove the columns, you can use np.all (or its ndarray method variant, arr.all) to find the columns you want to keep. Zero is falsy, so a column passes only if none of its entries is 0. Use the resulting boolean mask to index the second axis:
>>> arr[:, arr.all(axis=0)]
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
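To make the mask visible, here is a minimal self-contained sketch using the array from the question. The names keep and keep_explicit are just illustrative; the second form spells out "no entry in the column equals zero" and should give the same mask as arr.all(axis=0):

```python
import numpy as np

arr = np.array([[1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 1.],
                [1., 1., 1., 1., 0.],
                [1., 1., 1., 1., 0.]])

# True for each column whose entries are all nonzero (columns to keep).
keep = arr.all(axis=0)

# More explicit equivalent: no entry in the column compares equal to 0.
keep_explicit = ~(arr == 0).any(axis=0)

# Boolean mask on the second axis selects only the kept columns.
kept = arr[:, keep]
print(keep)
print(kept.shape)
```

The explicit form is a bit longer, but it may read more clearly when "zero" is the thing you are testing for rather than general truthiness.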
If you want to find the indices of those columns with at least one zero, you can use np.any in conjunction with np.nonzero (or np.flatnonzero if you prefer):
>>> np.any(arr == 0, axis=0).nonzero()
(array([2, 4], dtype=int64),)
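For comparison, a short sketch of the np.flatnonzero variant, again on the array from the question (has_zero and zero_cols are illustrative names). np.nonzero returns a tuple with one index array per dimension, while np.flatnonzero returns a plain 1-D index array, which can be more convenient for a 1-D mask:

```python
import numpy as np

arr = np.array([[1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 1.],
                [1., 1., 1., 1., 0.],
                [1., 1., 1., 1., 0.]])

# True for each column that contains at least one zero.
has_zero = (arr == 0).any(axis=0)

# 1-D array of column indices, no tuple unpacking needed.
zero_cols = np.flatnonzero(has_zero)
print(zero_cols)
```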
If you want to count them, you can sum the boolean mask directly:
>>> np.any(arr == 0, axis=0).sum()
2
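Since the question asks for both the filtered array and the number of removed columns, the pieces can be combined so the mask is computed only once. This is just a sketch; drop_zero_columns is a hypothetical helper name, not a NumPy function:

```python
import numpy as np

def drop_zero_columns(arr):
    """Hypothetical helper: return (array without zero-containing columns, count removed)."""
    # True for each column that contains at least one zero.
    has_zero = (arr == 0).any(axis=0)
    # Keep the other columns; summing the boolean mask counts the removed ones.
    return arr[:, ~has_zero], int(has_zero.sum())

arr = np.array([[1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 1.],
                [1., 1., 1., 1., 0.],
                [1., 1., 1., 1., 0.]])

result, n_removed = drop_zero_columns(arr)
print(result.shape, n_removed)
```

If you only need the count and already have the filtered array, arr.shape[1] - result.shape[1] gives the same number.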