pandas dataframe map using lambda function with multiple input arguments | AttributeError: 'DataFrame' object has no attribute 'map'

Given:

import numpy as np
import pandas as pd

df = pd.DataFrame(data={'user_ip': ["u1", "u2", "u3", "u4", "u5"],
                        'a':  [1, np.nan, 8, 2, 0], 
                        'b':  [2, 5, 1, np.nan, 0], 
                        'c':  [3, 0, np.nan, 0, 7],
                        'd':  [0, 2, 1, 2, 9],
                    },
                  )

  user_ip    a    b    c  d
0      u1  1.0  2.0  3.0  0
1      u2  NaN  5.0  0.0  2
2      u3  8.0  1.0  NaN  1
3      u4  2.0  NaN  0.0  2
4      u5  0.0  0.0  7.0  9

Goal:

I’d like to compute a new column row by row using a custom function that takes several input arguments (including the DataFrame and one of its column names), as follows:

def fcn(df, col, x, y):
    return x*df[col] + y

df["new_col_apply"] = df.apply(lambda inp_df: fcn(inp_df, col="b", x=2, y=10), axis=1)

My solution works, but the apply() method is quite slow on my original DataFrame, which contains more than 900K rows.

I am aware of map(), but my DataFrame has no map() method, and I specifically need to pass the DataFrame and its column (col) to my function fcn. The following snippet:

df["new_col_map"] = df.map(lambda inp_df: fcn(inp_df, col="b", x=2, y=10), na_action="ignore")

raises an AttributeError, as shown below:

---------------------------------------------------------------------------

AttributeError                            Traceback (most recent call last)

<ipython-input-14-dfe11c4bff87> in <cell line: 1>()
----> 1 df["new_col_map"] = df.map(lambda inp_df: fcn(inp_df, col="b", x=2, y=10), na_action="ignore")
      2 df

/usr/local/lib/python3.10/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   5900         ):
   5901             return self[name]
-> 5902         return object.__getattribute__(self, name)
   5903 
   5904     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'map'

Is there a better, faster alternative to apply() for running a custom function with several arguments over a large pandas DataFrame?

Cheers,

>Solution :

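Call the function directly on the whole DataFrame. fcn only performs arithmetic on df[col], and pandas applies that arithmetic element-wise to the entire column, so neither apply() nor map() is needed: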

df["new_col"] = fcn(df, col="b", x=2, y=10)
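As a rough check, here is a minimal sketch that reuses the sample df and fcn from the question and compares the direct call against the original apply() version. The column name new_col_map and the inline lambda are assumptions added for illustration only; they show Series.map as a fallback for functions that cannot be written as column arithmetic.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "user_ip": ["u1", "u2", "u3", "u4", "u5"],
    "a": [1, np.nan, 8, 2, 0],
    "b": [2, 5, 1, np.nan, 0],
    "c": [3, 0, np.nan, 0, 7],
    "d": [0, 2, 1, 2, 9],
})


def fcn(df, col, x, y):
    # Works for a single row (df[col] is a scalar) and for the
    # whole DataFrame (df[col] is a Series, computed element-wise).
    return x * df[col] + y


# Row-by-row version from the question: one Python call per row
df["new_col_apply"] = df.apply(lambda row: fcn(row, col="b", x=2, y=10), axis=1)

# Direct call: one vectorized operation on the whole column
df["new_col"] = fcn(df, col="b", x=2, y=10)

# Hypothetical fallback for logic that cannot be vectorized:
# Series.map calls the function once per value and skips NaN
df["new_col_map"] = df["b"].map(lambda v: 2 * v + 10, na_action="ignore")

print(df[["b", "new_col_apply", "new_col", "new_col_map"]])
```

All three columns should hold the same values, with NaN carried through wherever b is NaN. Note that Series.map works on one column at a time. DataFrame.map, to my knowledge, was only added in pandas 2.1 as the new name for applymap, and it passes individual cell values to the function rather than rows, so it would not fit fcn(df, col, ...) even where it is available.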