Python – lambda on pandas dataframe with nan rows

I want to apply an alteration to a column of my dataframe where the cells are not empty. This is the dataframe that I am using:

import pandas as pd

df = pd.DataFrame([{'name': None, 'client': None, 'fruit': 'orange'},
                   {'name': 'halley', 'client': 'abana', 'fruit': 'pear'},
                   {'name': 'josh', 'client': 'a', 'fruit': 'apple'},
                   {'name': 'kim', 'client': 'b', 'fruit': 'apple'}])

output:

   name    client fruit
0  nan     nan    orange
1  halley  abana  pear
2  josh    a      apple
3  kim     b      apple

I want to rename clients whose string is shorter than 5 characters to ‘client_x’ instead, and this is what I tried:

df['client'] = df['client'].apply(lambda x: x if len(x) > 5 else "client_" + x)

but I get one of the following two errors:

TypeError: object of type 'float' has no len()
TypeError: object of type 'NoneType' has no len()

I don’t understand why nan is treated as a float, and I would really like a clean way to handle this.

Any help would be greatly appreciated!!

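Solution:

NaN is a floating-point value, which is why len() raises a TypeError on it. Use Series.str.len, which returns NaN for missing values instead of raising, together with numpy.where. Missing clients stay missing: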

import numpy as np

df['client'] = np.where(df['client'].str.len() >= 5, df['client'], "client_" + df['client'])
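
As a quick check, here is a minimal end-to-end sketch that applies the same np.where call to the dataframe from the question. It assumes the default object dtype that pandas infers for this data; nothing is introduced beyond the question's own column names.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the first row has no name or client.
df = pd.DataFrame([
    {'name': None, 'client': None, 'fruit': 'orange'},
    {'name': 'halley', 'client': 'abana', 'fruit': 'pear'},
    {'name': 'josh', 'client': 'a', 'fruit': 'apple'},
    {'name': 'kim', 'client': 'b', 'fruit': 'apple'},
])

# NaN is an ordinary Python float, which explains the first TypeError.
assert isinstance(np.nan, float)

# str.len() yields NaN for missing entries instead of raising.
lengths = df['client'].str.len()

# NaN >= 5 evaluates to False, so missing rows take the third argument,
# and "client_" + <missing> is itself missing.
df['client'] = np.where(lengths >= 5, df['client'], "client_" + df['client'])

print(df)
```

After this runs, 'abana' is left unchanged (its length is exactly 5), 'a' and 'b' become 'client_a' and 'client_b', and the first row's client is still missing.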