Mutate multiple pandas DataFrames in place using a function

I would like to write a function that takes multiple dataframes with the same structure, applies specific transformations, and saves those transformations in place.

Dummy dataframes

import pandas as pd

df = pd.DataFrame({"Full name": ["John Doe", "Deep Smith", "Julia Carter", "Kate Newton", "Sandy Thompson"],
                   "Monthly Sales": [25, 30, 35, 40, 45]})

df2 = pd.DataFrame({"Full name": ["Alicia Williams", "Kriten John", "Jessica Adams", "Isaac Newton", "Whitney Gordon"],
                    "Monthly Sales": [35, 20, 50, 15, 40]})

Transformative function

I don’t want to return the dataframe, but rather save those transformations in place.

def tidy_dfs(dfs):
    for df in dfs:
        # Drop first row
        df = df.iloc[1: , :]
        # Replace spaces in columns
        df.columns = [c.replace(' ', '_') for c in df]
        # change cols to lower
        df.columns = [c.lower() for c in df]
    return df

Assigning df, df2 = tidy_dfs([df, df2]) of course won't work, because the return statement sits outside the loop and only the last dataframe is returned.

Results
What would be a way to call this function and save the transformation inplace?

tidy_dfs([df,df2])

Solution:

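EDIT: If you pass a list of DataFrames, you can either return a new list (out) and assign the results back, or modify each DataFrame with in-place methods. Rebinding the loop variable (df = df.iloc[1:, :]) creates a new object and never changes the caller's DataFrames, so that approach always needs the assignment shown in its last step.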
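To make the difference between rebinding and mutating concrete, here is a minimal sketch. The helper names rebind and mutate and the three-row sales frame are made up for illustration; they are not part of the question.

```python
import pandas as pd

def rebind(frames):
    # Hypothetical helper: assigning to the loop variable only rebinds the
    # local name, so the caller's DataFrame is left untouched.
    for frame in frames:
        frame = frame.iloc[1:, :]

def mutate(frames):
    # Hypothetical helper: drop(..., inplace=True) changes the very object
    # that the caller also holds a reference to.
    for frame in frames:
        frame.drop(frame.index[0], inplace=True)

sales = pd.DataFrame({"Full name": ["John Doe", "Deep Smith", "Julia Carter"],
                      "Monthly Sales": [25, 30, 35]})

rebind([sales])
rows_after_rebind = len(sales)   # unchanged: 3

mutate([sales])
rows_after_mutate = len(sales)   # first row removed: 2
```

Only the second helper changes what the caller sees, which is why the slicing version has to return its results.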

Your function does not return a list of DataFrames, so you need to create an empty list and append each cleaned DataFrame to it:

def tidy_dfs(dfs):
    out = []
    for df in dfs:
        # Drop first row
        df = df.iloc[1: , :]
        # Replace spaces in columns
        df.columns = [c.replace(' ', '_') for c in df]
        # change cols to lower
        df.columns = [c.lower() for c in df]
        out.append(df)
    return out

df,df2 = tidy_dfs([df,df2])

For in-place operations, use methods that modify the DataFrame object itself:

def tidy_dfs(dfs):
    for df in dfs:
        # Drop first row
        df.drop(df.index[0], inplace=True)
        # Replace spaces in columns and lowercase
        df.rename(columns = lambda x: x.replace(' ', '_').lower(), inplace=True)

    return dfs

# The DataFrames are modified in place, so assigning the result back is optional
df, df2 = tidy_dfs([df,df2])
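As a rough check that the in-place version really does what the question asks, the sketch below calls it without assigning anything back. The name tidy_dfs_inplace is an illustrative rename, and the function returns nothing on purpose; the data is the dummy data from the question.

```python
import pandas as pd

def tidy_dfs_inplace(dfs):
    # Illustrative variant of the in-place answer that returns nothing.
    for df in dfs:
        # Drop the first row of the caller's DataFrame
        df.drop(df.index[0], inplace=True)
        # Replace spaces in column names and lowercase them
        df.rename(columns=lambda x: x.replace(" ", "_").lower(), inplace=True)

df = pd.DataFrame({"Full name": ["John Doe", "Deep Smith", "Julia Carter", "Kate Newton", "Sandy Thompson"],
                   "Monthly Sales": [25, 30, 35, 40, 45]})
df2 = pd.DataFrame({"Full name": ["Alicia Williams", "Kriten John", "Jessica Adams", "Isaac Newton", "Whitney Gordon"],
                    "Monthly Sales": [35, 20, 50, 15, 40]})

tidy_dfs_inplace([df, df2])      # no assignment needed

print(df.columns.tolist())       # ['full_name', 'monthly_sales']
print(len(df), len(df2))         # 4 4
```

Note that the remaining rows keep their original index labels (1 to 4), since drop does not renumber the index.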