I am trying to check each row of a pandas DataFrame to see whether two conditions are met. If they are, I want to change one of the values in that row.
import pandas as pd
d = {"age": [10, 20, 40, 20, 30, 20], "job": ["teacher", "teacher", "chef", "teacher", "doctor", "lifeguard"]}
df = pd.DataFrame(data=d)
print(df.head())
print("-"*20)
#mask = df[df["age"] == 20 and df["job"] == "teacher"]
df.loc[df["age"] == 20 and df["job"] == "teacher"] = "REPLACED!"
print(df.head())
I thought I would be able to make a boolean mask with the commented-out line, but was unable to do so.
>Solution :
This is a common error. You’re doing two things wrong:
- With pandas masks, you use & instead of and, and | instead of or.
- & and | have a higher precedence than ==, so you need to wrap each x == y expression in parentheses:
df.loc[(df["age"] == 20) & (df["job"] == "teacher")] = "REPLACED!"
Output:
>>> df
         age        job
0         10    teacher
1  REPLACED!  REPLACED!
2         40       chef
3  REPLACED!  REPLACED!
4         30     doctor
5         20  lifeguard
Note that if you dislike wrapping the x == y expressions in parentheses, you can use x.eq(y) (which is pandas-specific):
df.loc[df["age"].eq(20) & df["job"].eq("teacher")] = "REPLACED!"
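The assignments above overwrite every column in the matching rows. Since the question mentions changing only one of the values, here is a minimal sketch that stores the mask in a variable (as the commented-out line attempted) and passes a column label as the second argument to .loc. The choice of "job" as the column to overwrite is an assumption made for illustration:

```python
import pandas as pd

d = {
    "age": [10, 20, 40, 20, 30, 20],
    "job": ["teacher", "teacher", "chef", "teacher", "doctor", "lifeguard"],
}
df = pd.DataFrame(data=d)

# Build the boolean mask once; each comparison is parenthesised because
# & binds more tightly than ==.
mask = (df["age"] == 20) & (df["job"] == "teacher")

# The second argument to .loc restricts the assignment to one column
# ("job" is an assumed choice), so "age" is left untouched.
df.loc[mask, "job"] = "REPLACED!"

print(df)
```

Keeping the mask in its own variable also makes it easy to inspect with print(mask) or to reuse it for further selections.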