Apply str title to df columns values from dictionary values

I have a dictionary that maps column names to functions. I have written a function that should capitalize the values in a df column with str.title().

import pandas as pd
 
data= [["English","john","smith","ohio","united states","","","manufacturing","National","Residental","","",""]]
df= pd.DataFrame(data,columns=['Communication_Language__c','firstName', 'lastName', 'state', 'country', 'company', 'email', 'industry', 'System_Type__c', 'AccountType', 'customerSegment', 'Existing_Customer__c', 'GDPR_Email_Permission__c'])

  Communication_Language__c firstName lastName state        country company email       industry System_Type__c AccountType customerSegment Existing_Customer__c GDPR_Email_Permission__c
0                   English      john    smith  ohio  united states                manufacturing       National  Residental

def capitalize(column, df_temp):
    # Title-case the column only when it contains no missing values
    if df_temp[column].notna().all():
        df_temp[column] = df_temp[column].str.title()
    return df_temp

def required():
    # something
    pass

parsing_map={
"firstName":[capitalize,required],
"lastName":capitalize,
"state":capitalize,
"country": [capitalize,required],
"industry":capitalize,
"System_Type__c":capitalize,
"AccountType":capitalize,
"customerSegment":capitalize,
}

I wrote the capitalize function above to apply str.title(), but is there a way to apply it to the df columns without naming them all?

What would be the best way to reference the dictionary function mapping to apply str.title() to all of the contents in the columns with a function "capitalize"?

Desired output:

data= [["English","John","Smith","Ohio","United States","","","Manufacturing","National","Residental","","",""]]
df= pd.DataFrame(data,columns=['Communication_Language__c','firstName', 'lastName', 'state', 'country', 'company', 'email', 'industry', 'System_Type__c', 'AccountType', 'customerSegment', 'Existing_Customer__c', 'GDPR_Email_Permission__c'])

  Communication_Language__c firstName lastName state        country company email       industry System_Type__c AccountType customerSegment Existing_Customer__c GDPR_Email_Permission__c
0                   English      John    Smith  Ohio  United States                Manufacturing       National  Residental

Solution:

Normally you would use DataFrame.apply for this, e.g.:

cols_to_capitalize = list(parsing_map.keys())
df[cols_to_capitalize] = df[cols_to_capitalize].apply(lambda x: x.str.title())
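As a minimal sketch of that apply approach on a tiny frame: the column list is written out by hand here instead of being taken from parsing_map, and the sample rows (including the missing value and the email address) are made up for illustration. It also shows that str.title() leaves missing values as missing, so a separate notna check is not strictly required for this step.

```python
import pandas as pd

# Made-up sample data; the None stands in for a missing first name.
df = pd.DataFrame({
    "firstName": ["john", None],
    "country": ["united states", "new zealand"],
    "email": ["j.smith@example.com", ""],
})

# Only these columns are title-cased; "email" is left untouched.
cols_to_capitalize = ["firstName", "country"]
df[cols_to_capitalize] = df[cols_to_capitalize].apply(lambda s: s.str.title())

print(df)
```

Columns that are not listed keep their original values, which is why the email column is unchanged.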

If you want to keep your method dictionary, I would suggest that you write the methods to act on a column, not on the dataframe. Something like this:

import pandas as pd

data= [["English","john","smith","ohio","united states","","","manufacturing","National","Residental","","",""]]
df= pd.DataFrame(data,columns=['Communication_Language__c','firstName', 'lastName', 'state', 'country', 'company', 'email', 'industry', 'System_Type__c', 'AccountType', 'customerSegment', 'Existing_Customer__c', 'GDPR_Email_Permission__c'])

def capitalize(col):
    if col.notna().all():
        return col.str.title()
    return col

def required(col):
    # TODO do stuff
    return col

parsing_map={
    "firstName":[capitalize,required],
    "lastName":[capitalize],
    "state":[capitalize],
    "country": [capitalize,required],
    "industry":[capitalize],
    "System_Type__c":[capitalize],
    "AccountType":[capitalize],
    "customerSegment":[capitalize],
}


for col_name, fns in parsing_map.items():
    for fn in fns:
        df[col_name] = fn(df[col_name])
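The map in the question mixed bare functions with lists of functions. If you would rather not rewrite every entry as a list, one option is to normalize each value before looping. The sketch below assumes that design; as_list is a hypothetical helper, not part of pandas, and the two-column frame is a cut-down stand-in for the real data.

```python
import pandas as pd

def capitalize(col):
    # Title-case every string in the column; missing values stay missing.
    return col.str.title()

def required(col):
    # Placeholder: returns the column unchanged.
    return col

def as_list(fns):
    # Hypothetical helper: wrap a bare function in a list, pass lists through.
    return list(fns) if isinstance(fns, (list, tuple)) else [fns]

# Mixed map, as in the question: some values are lists, some are bare functions.
parsing_map = {
    "firstName": [capitalize, required],
    "lastName": capitalize,
}

df = pd.DataFrame({"firstName": ["john"], "lastName": ["smith"]})

for col_name, fns in parsing_map.items():
    for fn in as_list(fns):
        df[col_name] = fn(df[col_name])

print(df)
```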

You could also pass the full df into these functions if they need to access other columns, but having each one still return only a single column keeps the design clearer.
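A rough sketch of that variant, assuming every function takes (df, col_name) and returns one column. The rule blank_if_no_last_name is purely hypothetical and exists only to show a function that reads a second column; the sample rows are made up as well.

```python
import pandas as pd

def capitalize(df, col_name):
    # Reads only its own column, but shares the (df, col_name) signature.
    return df[col_name].str.title()

def blank_if_no_last_name(df, col_name):
    # Hypothetical rule: looks at "lastName", yet still returns a single column.
    return df[col_name].where(df["lastName"] != "", "")

parsing_map = {
    "firstName": [capitalize, blank_if_no_last_name],
    "lastName": [capitalize],
}

df = pd.DataFrame({"firstName": ["john", "jane"], "lastName": ["smith", ""]})

for col_name, fns in parsing_map.items():
    for fn in fns:
        # Each function sees the whole frame but writes back only one column.
        df[col_name] = fn(df, col_name)

print(df)
```

Because each function returns only its own column, the loop stays the single place where the frame is modified.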

But you should think carefully about whether you really need to reinvent the .apply functionality.
