I have a dataframe as follows:
| Index | Value | Condition1 | Condition2 |
|---|---|---|---|
| 1 | 1 | True | False |
| 2 | 5 | False | False |
| 3 | 3 | False | True |
| 4 | 3 | False | False |
| 5 | 3 | True | False |
| 6 | 6 | False | True |
I want to find the rows where Condition2 is True and, for each of them, populate a new column with the Value from the most recent previous row where Condition1 is True. Example of desired output:
| Index | Value | Condition1 | Condition2 | New Column |
|---|---|---|---|---|
| 1 | 1 | True | False | None |
| 2 | 5 | False | False | None |
| 3 | 3 | False | True | 1 |
| 4 | 3 | False | False | None |
| 5 | 3 | True | False | None |
| 6 | 6 | False | True | 3 |
I tried using tail() with np.where():
df["New Column"] = np.where(df["Condition 2"]==True,df[df["Condition 1"]==True].["Value"].tail(1),None)
However, this gave a syntax error. Since the dataframe may grow larger, I would also like to do this without a loop.
Appreciate any help and advice!
>Solution :
You can combine where with ffill: keep Value only where Condition1 is True, forward-fill it, then keep the result only where Condition2 is True:
df['New Column'] = (df['Value'].where(df['Condition1']).ffill()
                    .where(df['Condition2'])
                   )
Variant with boolean indexing:
df.loc[df['Condition2'], 'New Column'] = df['Value'].where(df['Condition1']).ffill()
Output:
   Index  Value  Condition1  Condition2  New Column
0      1      1        True       False         NaN
1      2      5       False       False         NaN
2      3      3       False        True         1.0
3      4      3       False       False         NaN
4      5      3        True       False         NaN
5      6      6       False        True         3.0
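For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the column names are exactly Value, Condition1 and Condition2 (no spaces) and rebuilds the sample frame by hand; the variable name last_cond1_value is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (column names assumed to have no spaces).
df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6],
    "Value": [1, 5, 3, 3, 3, 6],
    "Condition1": [True, False, False, False, True, False],
    "Condition2": [False, False, True, False, False, True],
})

# Keep Value only on Condition1 rows, then carry that value forward.
last_cond1_value = df["Value"].where(df["Condition1"]).ffill()

# Keep the carried value only on Condition2 rows; every other row becomes NaN.
df["New Column"] = last_cond1_value.where(df["Condition2"])

print(df)
```

The new column comes out as float because NaN marks the empty rows. Note that with this approach a row where both conditions are True would pick up its own Value; if only strictly earlier rows should count, adding a .shift() after the ffill() is one possible adjustment.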
Intermediates:
   Index  Value  Condition1  Condition2  where (condition1)  ffill  where (condition2)
0      1      1        True       False                 1.0    1.0                 NaN
1      2      5       False       False                 NaN    1.0                 NaN
2      3      3       False        True                 NaN    1.0                 1.0
3      4      3       False       False                 NaN    1.0                 NaN
4      5      3        True       False                 3.0    3.0                 NaN
5      6      6       False        True                 NaN    3.0                 3.0
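The intermediate columns above can be rebuilt one step at a time, which may help when debugging the chain on real data. This is a sketch under the same assumptions about column names; step1, step2, step3 and steps are hypothetical names used only here.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6],
    "Value": [1, 5, 3, 3, 3, 6],
    "Condition1": [True, False, False, False, True, False],
    "Condition2": [False, False, True, False, False, True],
})

# Step 1: Value survives only where Condition1 is True, NaN elsewhere.
step1 = df["Value"].where(df["Condition1"])

# Step 2: forward-fill so each row carries the last Condition1 value seen so far.
step2 = step1.ffill()

# Step 3: blank out everything except the Condition2 rows.
step3 = step2.where(df["Condition2"])

# Put the three steps side by side for inspection.
steps = pd.DataFrame({
    "where (condition1)": step1,
    "ffill": step2,
    "where (condition2)": step3,
})
print(steps)
```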