Numpy np.where condition with multiple columns

by MR

March 17, 2023

I have a dataframe

import pandas as pd
import numpy as np

data = pd.DataFrame({"col1": [0, 1, 1, 1, 1, 0],
                     "col2": [False, True, False, False, True, False]
                     })

data

I’m trying to create a column col3 that is 1 where col1 == 1 and col2 is True, and 0 otherwise.

Using np.where:

data.assign(col3=np.where(data["col1"]==1 & data["col2"], 1, 0))

   col1   col2  col3
0     0  False     1
1     1   True     1
2     1  False     0
3     1  False     0
4     1   True     1
5     0  False     1

In the first row (index 0), col1 == 0 and col2 == False, but I’m getting col3 as 1.

What am I missing?

The desired output:


   col1   col2  col3
0     0  False     0
1     1   True     1
2     1  False     0
3     1  False     0
4     1   True     1
5     0  False     0

Solution:

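You are missing parentheses. & has higher precedence than ==, so data["col1"]==1 & data["col2"] is evaluated as data["col1"] == (1 & data["col2"]). Wrap the comparison in parentheses: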

data.assign(col3=np.where((data["col1"]==1) & data["col2"], 1, 0))
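The grouping can be checked directly. The following is a minimal sketch using the same data as the question; the variable names (unparenthesized, explicit_grouping, intended) are illustrative and not from the original post. It compares the unparenthesized expression with the explicitly grouped form that Python's precedence rules imply.

```python
import pandas as pd

data = pd.DataFrame({"col1": [0, 1, 1, 1, 1, 0],
                     "col2": [False, True, False, False, True, False]})

# Without parentheses, & binds tighter than ==,
# so this is parsed as col1 == (1 & col2).
unparenthesized = data["col1"] == 1 & data["col2"]
explicit_grouping = data["col1"] == (1 & data["col2"])

# With parentheses, the comparison runs first, then the element-wise AND.
intended = (data["col1"] == 1) & data["col2"]

print(unparenthesized.equals(explicit_grouping))  # True
print(unparenthesized.tolist())  # [True, True, False, False, True, True]
print(intended.tolist())         # [False, True, False, False, True, False]
```

Rows 0 and 5 come out True in the unparenthesized version because 1 & False is falsy and col1 is 0 there, so the equality test succeeds.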

One way to avoid the precedence problem is to use the eq method instead of ==:

data.assign(col3=np.where(data["col1"].eq(1) & data["col2"], 1, 0))

You can also replace numpy.where with astype, since the boolean mask converts directly to 0 and 1:

data.assign(col3=((data["col1"]==1) & data["col2"]).astype(int))

Output:

   col1   col2  col3
0     0  False     0
1     1   True     1
2     1  False     0
3     1  False     0
4     1   True     1
5     0  False     0
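As a quick sanity check, the sketch below computes col3 with each of the three variants and confirms they agree. The variable names (mask, with_parentheses, with_eq, with_astype) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"col1": [0, 1, 1, 1, 1, 0],
                     "col2": [False, True, False, False, True, False]})

# Boolean mask: True only where col1 == 1 and col2 is True.
mask = (data["col1"] == 1) & data["col2"]

with_parentheses = np.where(mask, 1, 0)                          # NumPy array
with_eq = np.where(data["col1"].eq(1) & data["col2"], 1, 0)      # NumPy array
with_astype = mask.astype(int)                                   # pandas Series

expected = [0, 1, 0, 0, 1, 0]
print(list(with_parentheses) == expected)
print(list(with_eq) == expected)
print(with_astype.tolist() == expected)
```

np.where returns a NumPy array while astype returns a Series; both can be passed to assign as shown in the answer.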
