
Normalize/scale dataframe in a certain range

I have the following DataFrame:

import pandas as pd
from pandas import Timestamp

df = pd.DataFrame({'DateTime': {0: Timestamp('2022-02-08 00:00:00'),
  1: Timestamp('2022-02-08 00:10:00'),
  2: Timestamp('2022-02-08 00:20:00'),
  3: Timestamp('2022-02-08 00:30:00'),
  4: Timestamp('2022-02-08 00:40:00')},
 'wind power [W]': {0: 83.9, 1: 57.2, 2: 58.2, 3: 48.0, 4: 69.5}})

             DateTime  wind power [W]
0 2022-02-08 00:00:00            83.9
1 2022-02-08 00:10:00            57.2
2 2022-02-08 00:20:00            58.2
3 2022-02-08 00:30:00            48.0
4 2022-02-08 00:40:00            69.5

As you can see, 83.9 is the maximum value in my second column and 48.0 is the minimum. I want to rescale these values to the range 0.6 to 8.4, so that 83.9 becomes 8.4 and 48.0 becomes 0.6. The rest of the numbers would fall somewhere in between.
So far I have only managed to normalize the column to the range 0 to 1, with this code:

df['normalized'] = (df['wind power [W]']-df['wind power [W]'].min())/(df['wind power [W]'].max()-df['wind power [W]'].min())

I don’t know how to proceed from here to get these numbers into my desired range. Can someone help me, please?

Solution:

We can use scikit-learn's MinMaxScaler to perform feature scaling. It supports a parameter called feature_range, which lets us specify the desired range of the transformed data:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0.6, 8.4))
df['normalized'] = scaler.fit_transform(df['wind power [W]'].values[:, None])
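
Since the scaler stores the minimum and maximum it was fitted on, it can also be reused on later readings. The following is a minimal sketch of that idea, not part of the original answer: the `new_readings` frame and its values are made up for illustration, and the double-bracket selection with `ravel()` is just one alternative way to pass a 2-D input and get a 1-D column back.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({'wind power [W]': [83.9, 57.2, 58.2, 48.0, 69.5]})

scaler = MinMaxScaler(feature_range=(0.6, 8.4))
# Double brackets select a one-column DataFrame, which is already 2-D.
# ravel() flattens the (n, 1) result to 1-D for the column assignment.
df['normalized'] = scaler.fit_transform(df[['wind power [W]']]).ravel()

# Hypothetical later readings (illustrative values only): the original
# minimum, the midpoint, and the original maximum.
new_readings = pd.DataFrame({'wind power [W]': [48.0, 65.95, 83.9]})

# transform() applies the min/max learned during fit_transform().
scaled_new = scaler.transform(new_readings).ravel()

# inverse_transform() maps scaled values back to the original units.
restored = scaler.inverse_transform(scaled_new.reshape(-1, 1)).ravel()
```

Note that readings outside the fitted minimum and maximum would land outside 0.6 to 8.4 with these settings.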

Alternatively, if you don’t want to use MinMaxScaler, here is a way to scale the data using pandas only:

w = df['wind power [W]'].agg(['min', 'max'])
norm = (df['wind power [W]'] - w['min']) / (w['max'] - w['min'])
df['normalized'] = norm * (8.4 - 0.6) + 0.6

print(df)

             DateTime  wind power [W]  normalized
0 2022-02-08 00:00:00            83.9    8.400000
1 2022-02-08 00:10:00            57.2    2.598886
2 2022-02-08 00:20:00            58.2    2.816156
3 2022-02-08 00:30:00            48.0    0.600000
4 2022-02-08 00:40:00            69.5    5.271309
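
The pandas-only approach is the linear map x' = (x - min) / (max - min) * (b - a) + a, with a = 0.6 and b = 8.4 here. If the same rescaling is needed for several columns or ranges, it could be wrapped in a small helper along these lines. This is a sketch rather than part of the original answer: the name `rescale` and its parameters are hypothetical, and mapping a constant column to the midpoint of the target range is an assumption made only to avoid dividing by zero.

```python
import pandas as pd


def rescale(series: pd.Series, new_min: float, new_max: float) -> pd.Series:
    """Linearly map a numeric Series onto [new_min, new_max]."""
    old_min, old_max = series.min(), series.max()
    span = old_max - old_min
    if span == 0:
        # Assumption: a constant column has no spread to rescale, so
        # every value is placed at the midpoint of the target range.
        midpoint = (new_min + new_max) / 2
        return pd.Series(midpoint, index=series.index, dtype=float)
    # Normalize to 0-1 first, then stretch and shift to the target range.
    return (series - old_min) / span * (new_max - new_min) + new_min


power = pd.Series([83.9, 57.2, 58.2, 48.0, 69.5], name='wind power [W]')
scaled = rescale(power, 0.6, 8.4)
print(scaled.round(6).tolist())
```

With the question's data, this should reproduce the values in the normalized column shown above.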