My aim is to put values in column D based on columns A to C. I want to go through each of the columns A, B and C and add 1 to column D if the value is greater than 20
import pandas as pd
data={'A':[5,2,25,4],"B":[15,22,100,24], "C":[4, 100, 0, 19], "D" : [0,0,0,0]}
df= pd.DataFrame(data)
for x in range(0, len(df.columns)):
if df.iloc[:, x] > 20:
df["D"] = df["D"] +1
else:
df["D"] = df["D"]
df
So, based on the sample code, Column D should look like
| D |
|---|
| 0 |
| 2 |
| 2 |
| 1 |
However, i keep getting this error – ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I have gone through other answers here (Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()) which indicate that it is a multiple operator issue but I am not using these here.
what I am doing wrong?
>Solution :
What is wrong is that you can’t use if Series > value in a vectorial way. if expects a single boolean value and df.iloc[:, x] > 20 returns a Series of booleans.
In your case use:
df['D'] = df.drop(columns='D').gt(20).sum(1)
NB. If column D is initially null or inexistent, skip the .drop(columns='D').
Output:
A B C D
0 5 15 4 0
1 2 22 100 2
2 25 100 0 2
3 4 24 19 1