Looping with an if statement over a DataFrame

March 22, 2022

I’m running into an issue when iterating over rows in a pandas DataFrame.

This is the code I am trying to run:

import pandas as pd

data = {'test': [1, 1, 0, 0, 3, 1, 0, 3, 0],
        'test2': [0, 2, 0, 1, 1, 2, 7, 3, 2]}
df = pd.DataFrame(data)
df['combined'] = df['test'] + df['test2']
df['combined'].astype('float64')
df
    
for index, row in df.iterrows():
    if row['test']>=1 & row['test2']>=1:
        row['combined']/=2
    else:
        pass

It should divide combined by 2 when both test and test2 have a value of 1 or more; however, it doesn’t divide all the rows that should be divided.

Am I making a mistake somewhere?

This is the outcome when I run the code. The columns are the index, test, test2, and combined:

0   1   0   1
1   1   2   3
2   0   0   0
3   0   1   1
4   3   1   2
5   1   2   3
6   0   7   7
7   3   3   3
8   0   2   2

>Solution :

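Iterating over rows is generally discouraged in pandas for performance reasons and should be avoided unless it is strictly necessary. The reason your loop skips rows is operator precedence: & binds more tightly than >=, so row['test']>=1 & row['test2']>=1 is evaluated as row['test'] >= (1 & row['test2']) >= 1, and each comparison needs its own parentheses. The better solution is to define a boolean mask with your conditions and operate on the masked rows using .loc: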

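import pandas as pd

data = {'test': [1, 1, 0, 0, 3, 1, 0, 3, 0],
        'test2': [0, 2, 0, 1, 1, 2, 7, 3, 2]}
df = pd.DataFrame(data)
# astype returns a new Series, so chain it into the assignment to keep the float dtype
df['combined'] = (df['test'] + df['test2']).astype('float64')
# True only for rows where both columns are at least 1
mask = (df['test'] >= 1) & (df['test2'] >= 1)
# Halve 'combined' only in the rows selected by the mask
df.loc[mask, 'combined'] /= 2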
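
To see why the parentheses around each comparison matter, here is a minimal sketch in plain Python that applies the question's condition, with and without parentheses, to the same values. The lists copy the two columns from the question; the helper names as_written and as_intended are only illustrative and do not come from pandas.

```python
# Column values copied from the question's DataFrame.
test = [1, 1, 0, 0, 3, 1, 0, 3, 0]
test2 = [0, 2, 0, 1, 1, 2, 7, 3, 2]


def as_written(a, b):
    # Python parses this as the chained comparison a >= (1 & b) >= 1,
    # because & binds more tightly than >=.
    return a >= 1 & b >= 1


def as_intended(a, b):
    # Parentheses make each comparison run first; & then combines the results.
    return (a >= 1) & (b >= 1)


pairs = list(zip(test, test2))
rows_as_written = [i for i, (a, b) in enumerate(pairs) if as_written(a, b)]
rows_as_intended = [i for i, (a, b) in enumerate(pairs) if as_intended(a, b)]

print(rows_as_written)   # [4, 7]
print(rows_as_intended)  # [1, 4, 5, 7]
```

Rows 4 and 7 are exactly the ones that were halved in the question's output, while the intended condition also selects rows 1 and 5. For scalar values inside a loop, the and operator would also work; & is the operator to use when combining whole boolean Series, as in the mask above.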