I have a DataFrame like this, and I want to find how many students have more than two school experiences.
df[["Name","Primary school","Middle school","High School"]].isnull()
| Name | Primary school | Middle school | High School |
|---|---|---|---|
| Alex | False | False | False |
| Peng | False | False | True |
| Hu | False | False | True |
We can use value_counts() to get a per-column summary, but how can I get the sum across each row?
Desired output:
df["Enough Experience?"] = # code
| Name | Primary school | Middle school | High School | Enough Experience? |
|---|---|---|---|---|
| Alex | False | False | False | True |
| Peng | False | False | True | False |
| Hu | False | False | True | False |
>Solution :
Use:
print (df)
   Name  Primary school  Middle school  High School
0  Alex             1.0            NaN          NaN
1  Peng             1.0            NaN          2.0
2    Hu             2.0            7.0          0.0
3  John             NaN            NaN          NaN
cols = ["Primary school","Middle school","High School"]
# True if the row has no missing values
df["Enough Experience?1"] = ~df[cols].isnull().any(axis=1)
# True if the row has at least 1 missing value
df["Enough Experience?2"] = df[cols].isnull().any(axis=1)
# True if the row has fewer than 2 missing values, counted per row with `sum`
df["Enough Experience?3"] = df[cols].isnull().sum(axis=1).lt(2)
# True if the row has at least 2 non-missing values, counted per row with `count`
df["Enough Experience?4"] = df[cols].count(axis=1).ge(2)
print (df)
   Name  Primary school  Middle school  High School  Enough Experience?1  \
0  Alex             1.0            NaN          NaN                False
1  Peng             1.0            NaN          2.0                False
2    Hu             2.0            7.0          0.0                 True
3  John             NaN            NaN          NaN                False

   Enough Experience?2  Enough Experience?3  Enough Experience?4
0                 True                False                False
1                 True                 True                 True
2                False                 True                 True
3                 True                False                False
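
If the goal is strictly "more than two" school experiences, as the question words it, the threshold would be `gt(2)` rather than `ge(2)`. Here is a minimal, self-contained sketch under that assumption. The sample values are copied from the frame printed above; the helper names `n_schools` and `n_students` are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the frame printed above (NaN marks a missing school).
df = pd.DataFrame({
    "Name": ["Alex", "Peng", "Hu", "John"],
    "Primary school": [1.0, 1.0, 2.0, np.nan],
    "Middle school": [np.nan, np.nan, 7.0, np.nan],
    "High School": [np.nan, 2.0, 0.0, np.nan],
})

cols = ["Primary school", "Middle school", "High School"]

# Number of non-missing school entries in each row.
n_schools = df[cols].notna().sum(axis=1)

# Assumption: "more than two" means strictly greater than 2.
df["Enough Experience?"] = n_schools.gt(2)

# Summing a boolean column counts its True values,
# which gives the number of students that qualify.
n_students = int(df["Enough Experience?"].sum())

print(n_schools.tolist())  # [1, 2, 3, 0]
print(n_students)          # 1
```

With this sample only Hu has all three schools filled in, so the count is 1. Swap `gt(2)` for `ge(2)` if two schools should already count as enough.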