Iterating over rows of a dataframe, and assigning multiple calculated values to the rows

June 1, 2022

I have a df:

import pandas as pd

dict1 = {'A': 1, 'B': 2, 'C': 3, 'D': 4}
dict2 = {'A': 10, 'B': 20, 'C': 30, 'D': 40}
dict3 = {'A': 100, 'B': 200, 'C': 300, 'D': 400}
df = pd.DataFrame([dict1, dict2, dict3])

(I’m working from home, I can’t copy paste the output here, sorry)

Now, I would like to ‘enlarge’ df, then assign calculated values to the new columns.

df[['new_col1', 'new_col2']] = None
for idx, row in df.iterrows():
    # insert the calculated values for `new_col1` and `new_col2` here
    pass  # placeholder so the loop body is valid Python

I think I do need to iterate over the rows, as the calculation is based on the values of the rows. I can of course manually insert the values for each cell one by one using .at, but I have hundreds of thousands of rows, and ~20 calculated values to fill in. How can I do this?

I tried:

dictt = {'new_col1': 1, 'new_col2': 2}
df.iloc[0] = df.iloc[0].map(dictt)

But then if I check what df.iloc[0] is, it's a row of NaN. I also tried:

df.iloc[0] = df.iloc[0].replace(dictt)

But that didn’t do anything. Also, if there is a better or more proper way to do operations like this, I’m all ears.
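A short sketch of why both attempts fail, assuming the same kind of df as in the question: Series.map and Series.replace both treat the dict keys as values to look up in the row, not as column labels. None of the values in the row equals 'new_col1' or 'new_col2', so map returns NaN everywhere and replace changes nothing. Assigning by label with .loc is one way to write several cells of a row at once; the values 1 and 2 are only the placeholders from dictt.

```python
import pandas as pd

df = pd.DataFrame([
    {'A': 1, 'B': 2, 'C': 3, 'D': 4},
    {'A': 10, 'B': 20, 'C': 30, 'D': 40},
])
# Create the two empty columns one at a time.
df['new_col1'] = None
df['new_col2'] = None

dictt = {'new_col1': 1, 'new_col2': 2}

# map() looks up each VALUE of the row (1, 2, 3, 4, None, None) among the
# dict keys; nothing matches, so every entry of the result is NaN.
mapped = df.iloc[0].map(dictt)
print(mapped.isna().all())

# Writing by column label instead: select the row and the target columns.
df.loc[0, ['new_col1', 'new_col2']] = [dictt['new_col1'], dictt['new_col2']]
print(df)
```

This still fills one row at a time, so it explains the NaN result rather than solving the performance problem for hundreds of thousands of rows.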

Solution:

If the calculation is a heavy, complicated function, the main bottleneck is that function, not pandas. In that case you can run it row by row with DataFrame.apply and return the new values as a Series:

def f(a, b):
    return pd.Series({'new_col1': 1 + a, 'new_col2': 2 + b})

df = df.join(df.apply(lambda x: f(x.A, x.B), axis=1))
print(df)
     A    B    C    D  new_col1  new_col2
0    1    2    3    4         2         4
1   10   20   30   40        11        22
2  100  200  300  400       101       202

Another idea:

def f(a, b):
    return (1 + a, 2 + b)

df[['col1','col2']] = df.apply(lambda x: f(x.A, x.B), axis=1, result_type='expand')
print(df)
     A    B    C    D  col1  col2
0    1    2    3    4     2     4
1   10   20   30   40    11    22
2  100  200  300  400   101   202
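With hundreds of thousands of rows and around 20 calculated values, it may also be worth trying to collect the per-row results in a plain list and build the new columns in one step, or to skip the row loop entirely when the calculation is simple arithmetic. The sketch below is only an illustration under those assumptions: f is a hypothetical stand-in for the real calculation, and the column names are the ones used in the question.

```python
import pandas as pd

df = pd.DataFrame([
    {'A': 1, 'B': 2, 'C': 3, 'D': 4},
    {'A': 10, 'B': 20, 'C': 30, 'D': 40},
    {'A': 100, 'B': 200, 'C': 300, 'D': 400},
])

def f(a, b):
    # Hypothetical per-row calculation; returns one dict entry per new column.
    return {'new_col1': 1 + a, 'new_col2': 2 + b}

# Option 1: loop over the needed columns with zip, collect one dict per row,
# and build all new columns at once. Reusing df.index keeps the join aligned.
new_cols = pd.DataFrame(
    [f(a, b) for a, b in zip(df['A'], df['B'])],
    index=df.index,
)
result = df.join(new_cols)
print(result)

# Option 2: if the calculation is plain column arithmetic, express it on
# whole columns and avoid iterating over rows at all.
vectorized = df.assign(new_col1=df['A'] + 1, new_col2=df['B'] + 2)
print(vectorized)
```

Both options give the same new columns as the apply versions on this small example. Which one is faster on real data depends on how expensive the actual function is, so it is worth timing them on a sample.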