I would like to create a new column (D) that contains the values from column (B) only where the boolean column (C) is True, with "NA" where it is False.
For example, here is my original table.
| A | B | C |
|---|---|---|
| 1 | car | True |
| 2 | car | True |
| 3 | bike | False |
| 4 | house | False |
I’d like to use the boolean column (C) to subset column (B) into a new column (D), like so:
| A | B | C | D |
|---|---|---|---|
| 1 | car | True | car |
| 2 | car | True | car |
| 3 | bike | False | NA |
| 4 | house | False | NA |
>Solution :
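Use df['C'] as the condition in np.where (with NumPy imported as np). Rows where C is False get NaN, which pandas treats as a missing value:

    df['D'] = np.where(df['C'], df['B'], np.nan)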
Output:

       A      B      C    D
    0  1    car   True  car
    1  2    car   True  car
    2  3   bike  False  NaN
    3  4  house  False  NaN
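
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the table from the question (the DataFrame construction is my assumption about how the data is loaded) and also shows Series.where as a pandas-only alternative; the column name D_alt is hypothetical and only there for comparison.

```python
import numpy as np
import pandas as pd

# Rebuild the example table from the question (assumed construction).
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": ["car", "car", "bike", "house"],
    "C": [True, True, False, False],
})

# np.where takes the value from B where C is True, and NaN elsewhere.
df["D"] = np.where(df["C"], df["B"], np.nan)

# Series.where keeps values where the condition holds and
# replaces the rest with NaN by default. D_alt is a hypothetical name.
df["D_alt"] = df["B"].where(df["C"])

print(df)
```

Both columns should end up with the same values. Series.where avoids the NumPy import and keeps the result as a pandas Series, which may be preferable if you are not already using NumPy elsewhere.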