Iterating over rows of a dataframe, and assigning multiple calculated values to the rows

I have a df:

import pandas as pd

dict1 = {'A': 1, 'B': 2, 'C': 3, 'D': 4}
dict2 = {'A': 10, 'B': 20, 'C': 30, 'D': 40}
dict3 = {'A': 100, 'B': 200, 'C': 300, 'D': 400}
df = pd.DataFrame([dict1, dict2, dict3])

(I’m working from home, I can’t copy paste the output here, sorry)

Now, I would like to ‘enlarge’ df, then assign calculated values to the new columns.

df[['new_col1', 'new_col2']] = None
for idx, row in df.iterrows():
    # insert the calculated values for `new_col1` and `new_col2` here
    pass

I think I do need to iterate over the rows, as the calculation is based on the values of the rows. I can of course manually insert the values for each cell one by one using .at, but I have hundreds of thousands of rows, and ~20 calculated values to fill in. How can I do this?

I tried:

dictt = {'new_col1': 1, 'new_col2': 2}
df.iloc[0] = df.iloc[0].map(dictt)

But then if I check what df.iloc[0] is, it’s a row of NaN. I also tried:

df.iloc[0] = df.iloc[0].replace(dictt)

But that didn’t do anything. Also, if there is a better or more proper way to do operations like this, I’m all ears.
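
A note on the two attempts above: both map() and replace() work on the values of the row, not on its column labels, which is why neither fills the new columns. The sketch below reproduces this on a single row; the final .loc assignment is only one possible way to write values into named columns of one row, shown here as an illustration rather than as the recommended approach for hundreds of thousands of rows.

```python
import pandas as pd

row = pd.Series({'A': 1, 'B': 2, 'C': 3, 'D': 4})
dictt = {'new_col1': 1, 'new_col2': 2}

# map() looks up each VALUE of the Series (1, 2, 3, 4) as a key in dictt.
# None of them is a key, so every entry becomes NaN.
mapped = row.map(dictt)
print(mapped.isna().all())

# replace() swaps values that are equal to a dict key ('new_col1', 'new_col2').
# No value matches, so the row comes back unchanged.
replaced = row.replace(dictt)
print(replaced.equals(row))

# Writing values into named columns of one row is a label-based assignment.
df = pd.DataFrame([row])
df['new_col1'] = None
df['new_col2'] = None
df.loc[0, list(dictt)] = list(dictt.values())
print(df)
```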

Solution:

If the calculation is a heavy, complicated function, the main bottleneck is that function, not pandas. In that case you can iterate over the rows with DataFrame.apply:

def f(a, b):
    return pd.Series({'new_col1': 1 + a, 'new_col2': 2 + b})

df = df.join(df.apply(lambda x: f(x.A, x.B), axis=1))
print(df)
     A    B    C    D  new_col1  new_col2
0    1    2    3    4         2         4
1   10   20   30   40        11        22
2  100  200  300  400       101       202
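
With around 20 calculated values per row, creating a pd.Series for every row adds overhead of its own. A possible variation, sketched here under the assumption that the per-row function can return a plain dict, is to collect the dicts in a list and build the new columns once. The function f below is a hypothetical stand-in for the real calculation.

```python
import pandas as pd

df = pd.DataFrame([
    {'A': 1, 'B': 2, 'C': 3, 'D': 4},
    {'A': 10, 'B': 20, 'C': 30, 'D': 40},
    {'A': 100, 'B': 200, 'C': 300, 'D': 400},
])

def f(a, b):
    # Hypothetical stand-in for the real per-row calculation;
    # it returns a plain dict instead of a pd.Series.
    return {'new_col1': 1 + a, 'new_col2': 2 + b}

# Call f once per row and collect the resulting dicts.
records = [f(a, b) for a, b in zip(df['A'], df['B'])]

# Build all new columns in one step, aligned on the index of df.
new_cols = pd.DataFrame(records, index=df.index)
result = df.join(new_cols)
print(result)
```

Passing index=df.index keeps the new columns aligned with the original rows even when df does not have a default 0..n-1 index.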

Another idea:

def f(a, b):
    return (1 + a, 2 + b)

df[['col1', 'col2']] = df.apply(lambda x: f(x.A, x.B), axis=1, result_type='expand')
print(df)
     A    B    C    D  col1  col2
0    1    2    3    4     2     4
1   10   20   30   40    11    22
2  100  200  300  400   101   202
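
If the calculation turns out to be simple enough to express as arithmetic on whole columns, row iteration may not be needed at all. This is a minimal sketch that assumes the same toy formulas as above (1 + A and 2 + B); whether the real calculation can be vectorized this way depends on what it does.

```python
import pandas as pd

df = pd.DataFrame([
    {'A': 1, 'B': 2, 'C': 3, 'D': 4},
    {'A': 10, 'B': 20, 'C': 30, 'D': 40},
    {'A': 100, 'B': 200, 'C': 300, 'D': 400},
])

# Each expression operates on an entire column at once,
# so no Python-level loop over the rows is involved.
result = df.assign(
    new_col1=1 + df['A'],
    new_col2=2 + df['B'],
)
print(result)
```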