How to apply StandardScaler to a single column?

I need to apply StandardScaler of sklearn to a single column col1 of a DataFrame:

df:

col1  col2  col3
1     0     A
1     10    C
2     1     A
3     20    B

This is how I did it:

from sklearn.preprocessing import StandardScaler

def listOfLists(lst):
    return [[el] for el in lst]

def flatten(t):
    return [item for sublist in t for item in sublist]

scaler = StandardScaler()

df['col1'] = flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist())))

However, when I apply inverse_transform, it does not return the initial values of col1. Instead, it returns the normalised values:

scaler.inverse_transform(flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist()))))

or:

scaler.inverse_transform(df['col1'])

Solution:

You could fit a scaler directly on the column. Since the scaler expects a 2D array, select the column as a one-column DataFrame with df[['col1']], and keep that same fitted scaler for the inverse transform:

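>>> scaler = StandardScaler()
>>> scaled = scaler.fit_transform(df[['col1']])  # 2D array of shape (4, 1)
>>> scaled.flatten()
array([-0.90453403, -0.90453403,  0.30151134,  1.50755672])

>>> # inverse_transform also expects 2D input, so pass the unflattened array
>>> scaler.inverse_transform(scaled).flatten()
array([1., 1., 2., 3.])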
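
A likely cause of the behaviour in the question is that fit_transform was called a second time on a column that had already been scaled. That refits the scaler on data with mean 0 and unit variance, so its inverse_transform maps the values back to (approximately) themselves. The sketch below rebuilds the example DataFrame from the question and compares the two round trips; the variable names `refit`, `roundtrip_wrong`, and `restored` are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Example data from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3],
    "col2": [0, 10, 1, 20],
    "col3": ["A", "C", "A", "B"],
})
original = df["col1"].to_numpy(dtype=float)

# Fit exactly once, on the raw values. df[["col1"]] is a 2D one-column DataFrame.
scaler = StandardScaler()
df["col1"] = scaler.fit_transform(df[["col1"]]).ravel()
scaled = df["col1"].to_numpy().copy()

# Pitfall: fitting again on the already-scaled column learns mean ~0 and
# scale ~1, so the round trip just returns the scaled values.
refit = StandardScaler()
roundtrip_wrong = refit.inverse_transform(refit.fit_transform(df[["col1"]])).ravel()

# Reusing the scaler fitted on the raw data recovers the original values.
restored = scaler.inverse_transform(df[["col1"]]).ravel()

print(roundtrip_wrong)  # same as the scaled column
print(restored)         # the original col1 values
```

If the original values are needed later, keep the fitted `scaler` object around (or keep an unscaled copy of the column) rather than fitting a new scaler on the transformed data.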