X_train, y_train from transformed data

December 1, 2021

How do I obtain X_train and y_train separately after transforming the data?

Code

from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
import pandas as pd
from sklearn.preprocessing import StandardScaler

DATA = pd.read_csv("/storage/emulated/0/Download/iris-write-from-docker.csv")

X = DATA.drop(["class"], axis="columns")
y = DATA["class"].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

pipe = Pipeline(steps=[('clf', StandardScaler())])
dta = pipe.fit_transform(X_train, y_train)

print(dta)

# print(X_train, y_train) from dta

I want to obtain the transformed X_train and y_train from dta.

> Solution:

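The output of fit_transform() is the transformed version of X_train only, so dta does not contain y_train. The pipeline accepts y_train as an argument, but StandardScaler ignores it, so the labels are neither used nor changed.

Therefore y_train stays exactly as train_test_split returned it, still aligned row by row with the transformed array, and you only need to retrieve the transformed X_train: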

pipe = Pipeline(steps=[('clf', StandardScaler())])
X_train_scaled = pipe.fit_transform(X_train)
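
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It assumes scikit-learn's bundled iris dataset as a stand-in for the CSV file from the question (the file itself is not available here), and the names X_test_scaled and X_train_scaled_df are only illustrative. It also shows the usual follow-up step of scaling the test set with transform() rather than fit_transform(), so the test data reuses the statistics learned from the training data.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled iris data replaces the CSV file from the question.
iris = load_iris(as_frame=True)
X = iris.data            # DataFrame with the four feature columns
y = iris.target.values   # NumPy array of class labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

pipe = Pipeline(steps=[("clf", StandardScaler())])

# fit_transform returns only the scaled features; y_train is left untouched.
X_train_scaled = pipe.fit_transform(X_train)

# The test set is scaled with the mean and std learned from the training set.
X_test_scaled = pipe.transform(X_test)

# Optional: put the scaled array back into a DataFrame with the original
# column names and row index, which makes it easy to line up with y_train.
X_train_scaled_df = pd.DataFrame(
    X_train_scaled, columns=X_train.columns, index=X_train.index
)

print(X_train_scaled_df.head())
print(y_train[:5])
```

Passing y_train to fit_transform() as in the question gives the same scaled array, since StandardScaler does not look at the labels.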

scikit-learn

byMR

Published December 01, 2021
