Pandas: Apply transformations to all character columns

I am working in Python to try and apply a few transformations to all character/string columns in a pandas dataframe. The transformations are:

  • Make everything uppercase
  • Trim the white space

I come from an R background, where this can be achieved with something like:


mydf <- mydf %>% 
  dplyr::mutate_if(is.character, toupper) %>% 
  dplyr::mutate_if(is.character, trimws)

In Python I am at a loss. I have tried the code below, which first identifies all the character columns and then attempts to make them upper case and trim the white space (the species column in this case):

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Create a sample dataset
iris = load_iris()

df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                  columns=iris['feature_names'] + ['target'])

df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)

# Make character columns upper case and then trim the white space
string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.upper())
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.strip())

df

I appreciate this might be a very basic question and thank you in advance to anyone who takes the time to help

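Solution:

You should be able to do this in one line with method chaining. Note that astype(str) casts every column to string, including the numeric ones, and that the result has to be assigned back to df to be kept: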

df.astype(str).apply(lambda x: x.str.upper().str.strip())

Output:

     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target    species
0                  5.1               3.5                1.4               0.2     0.0     SETOSA
1                  4.9               3.0                1.4               0.2     0.0     SETOSA
2                  4.7               3.2                1.3               0.2     0.0     SETOSA
3                  4.6               3.1                1.5               0.2     0.0     SETOSA
4                  5.0               3.6                1.4               0.2     0.0     SETOSA
...                ...               ...                ...               ...     ...        ...
145                6.7               3.0                5.2               2.3     2.0  VIRGINICA
146                6.3               2.5                5.0               1.9     2.0  VIRGINICA
147                6.5               3.0                5.2               2.0     2.0  VIRGINICA
148                6.2               3.4                5.4               2.3     2.0  VIRGINICA
149                5.9               3.0                5.1               1.8     2.0  VIRGINICA
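
If the numeric columns should keep their dtypes, which is closer to what mutate_if(is.character, ...) does in R, one option is to transform only the text-like columns. The sketch below is a minimal illustration, not the only way to do it. The helper name clean_text_columns and the small demo frame are hypothetical. It assumes column names are unique, treats object, string and categorical columns as "character" columns, and assumes any categorical labels are strings. The categorical check matters here because species in the question is created with pd.Categorical.from_codes, so a selection on "string" alone would likely miss it. The question's snippet also applies both steps to the original string_dtypes frame, so the strip assignment overwrites the upper-case result instead of building on it; chaining the two calls avoids that.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def clean_text_columns(df):
    """Return a copy of df with text-like columns stripped and upper-cased.

    Hypothetical helper: "text-like" here means object, string or
    categorical dtype (categorical labels are assumed to be strings).
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        is_text = (
            is_object_dtype(col.dtype)
            or is_string_dtype(col.dtype)
            or isinstance(col.dtype, pd.CategoricalDtype)
        )
        if is_text:
            # Chain both steps so the second one builds on the first.
            out[name] = col.str.strip().str.upper()
    return out


# Small illustrative frame (not the iris data from the question).
demo = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.7],
    "species": pd.Categorical([" setosa", "setosa ", "virginica"]),
    "note": ["  ok ", "Check", "ok"],
})

cleaned = clean_text_columns(demo)
print(cleaned)
print(cleaned.dtypes)
```

The string methods return a plain string column rather than a categorical one, so species can be converted back with astype("category") afterwards if the categorical dtype is needed.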