Map dataframe function without lambda

I have the following function:

def summarize(text, percentage=.6):
    import nltk
    # Split into sentences and keep the leading fraction given by `percentage`
    sentences = nltk.sent_tokenize(text)
    sentences = sentences[:int(percentage*len(sentences))]
    summary = ''.join([str(sentence) for sentence in sentences])
    return summary

I want to apply it to every row of a dataframe column. This works well with the following code:

df['summary'] = df['text'].map(summarize)

However, when I try to change the percentage argument in this call with df['summary'] = df['text'].map(summarize(percentage=.8)), I get an error saying that another argument, text, is required. Of course, this can be resolved with a lambda function as follows:

df['summary'] = df['text'].map(lambda x: summarize(x, percentage=.8))

But I do not want to use a lambda in the call. Is there another way to do it? For example, using kwargs inside the function to refer to the text column in the dataframe? Thank you.

>Solution :

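Series.map does not forward extra keyword arguments to the function (at least in the pandas version used here), so passing percentage to map raises a TypeError. Series.apply does forward them, so one possible solution is to use apply instead of map and pass the parameter as a named argument, without a lambda: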

df['summary'] = df['text'].map(summarize, percentage=.8)

TypeError: map() got an unexpected keyword argument 'percentage'

df['summary'] = df['text'].apply(summarize, percentage=.8)
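
If map has to stay in the call, another lambda-free option is functools.partial from the standard library, which binds percentage ahead of time and returns a callable that takes only the text. The sketch below is a minimal illustration, not the asker's code: the summarize here is a simplified stand-in that splits on ". " instead of calling nltk.sent_tokenize, and it uses the built-in map over a plain list so that it runs without pandas or NLTK.

```python
from functools import partial


def summarize(text, percentage=.6):
    # Hypothetical stand-in for the function in the question: it splits on
    # ". " rather than using nltk.sent_tokenize, so no NLTK data is needed.
    sentences = text.split(". ")
    kept = sentences[:int(percentage * len(sentences))]
    return ". ".join(kept)


# Bind the keyword argument once; the resulting callable takes only `text`.
summarize_80 = partial(summarize, percentage=.8)

# Built-in map over a list, used here as a stand-in for a dataframe column.
texts = ["One. Two. Three. Four. Five"]
summaries = list(map(summarize_80, texts))
print(summaries)
```

With pandas, the same callable can be passed directly, as in df['summary'] = df['text'].map(summarize_80), because map then receives a function of a single argument.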