How to use the pandas isin function with a 2D NumPy array?

I have created a 2D NumPy array with 2 rows and 5 columns:

import numpy as np
import pandas as pd

arr = np.zeros((2, 5))

arr[0] = [12, 94, 4, 4, 2]
arr[1] = [1, 3, 4, 12, 46]

I have also created a DataFrame with two columns, col1 and col2:

list1 = [1,2,3,4,5]
list2 = [2,3,4,5,6]
df = pd.DataFrame({'col1': list1, 'col2': list2})

I used the pandas isin function on col1 and col2 to create a boolean mask, like this:

df['col1'].isin(df['col2'])

output

0    False
1     True
2     True
3     True
4     True

Now I want to use these boolean values to slice the 2D array I created earlier. I can do that for a single row, but not for the whole 2D array at once:

print(arr[0][df['col1'].isin(df['col2'])])
print(arr[1][df['col1'].isin(df['col2'])])

output:

[94.  4.  4.  2.]
[ 3.  4. 12. 46.]

But when I try to index the whole array at once:

print(arr[df['col1'].isin(df['col2'])])

I get this error:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 2 but corresponding boolean dimension is 5

Is there a way to achieve this?

Solution:

The mask has 5 entries, one per column, so it has to index the second dimension (axis 1) of the array rather than the first:

arr[:, df['col1'].isin(df['col2'])]

output:

array([[94.,  4.,  4.,  2.],
       [ 3.,  4., 12., 46.]])
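
As a self-contained sketch of the same idea, the snippet below rebuilds the data from the question, converts the mask to a plain NumPy array with `to_numpy()`, and checks that its length matches the axis being indexed. The variable names `mask`, `result`, and `mask_np` are only illustrative, and the `np.isin` line is an alternative I am assuming gives the same mask here, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Data from the question
arr = np.array([[12, 94, 4, 4, 2],
                [1, 3, 4, 12, 46]], dtype=float)
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6]})

# Boolean mask with one entry per column of arr (length 5)
mask = df['col1'].isin(df['col2']).to_numpy()

# A 1-D boolean index must be as long as the axis it is applied to.
# arr has shape (2, 5), so a length-5 mask fits axis 1, not axis 0.
assert mask.shape[0] == arr.shape[1]

# Keep every row, and only the columns where the mask is True
result = arr[:, mask]
print(result)

# Alternative without the pandas method: np.isin on the raw values
mask_np = np.isin(df['col1'].to_numpy(), df['col2'].to_numpy())
assert (mask_np == mask).all()
```

Indexing with `arr[mask]` fails because NumPy applies a single boolean index to the first axis, which has length 2 here; putting `:` first moves the mask to the column axis.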