X_train, y_train from transformed data

How do I obtain X_train and y_train separately after transforming the data?

Code

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

DATA = pd.read_csv("/storage/emulated/0/Download/iris-write-from-docker.csv")

# Features are every column except the label column "class"
X = DATA.drop(["class"], axis="columns")
y = DATA["class"].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

pipe = Pipeline(steps=[('clf', StandardScaler())])
dta = pipe.fit_transform(X_train, y_train)

print(dta)

#print(X_train,y_train) from dta

I want to obtain the transformed X_train and y_train from dta.

Solution:

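fit_transform() returns the transformed version of X_train only. y_train is passed on to each step's fit, but StandardScaler ignores it, and a scikit-learn Pipeline never transforms y.

So you can retrieve the transformed X_train as follows. y_train is unchanged, and because the rows of the result are in the same order as X_train, it still lines up with y_train: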

pipe = Pipeline(steps=[('clf', StandardScaler())])
X_train_scaled = pipe.fit_transform(X_train)
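For a fuller picture, here is a minimal, self-contained sketch of the same idea. The CSV from the question is not available here, so the example assumes a small made-up DataFrame (random values, hypothetical columns f1 to f4 and a two-value "class" column) in its place. It also shows two optional steps that go beyond the answer above: wrapping the scaled array back into a DataFrame, and scaling X_test with the already fitted pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the CSV in the question:
# four numeric feature columns plus a "class" label column.
rng = np.random.default_rng(0)
DATA = pd.DataFrame(rng.normal(size=(20, 4)), columns=["f1", "f2", "f3", "f4"])
DATA["class"] = ["a", "b"] * 10

X = DATA.drop(["class"], axis="columns")
y = DATA["class"].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

pipe = Pipeline(steps=[("clf", StandardScaler())])

# fit_transform returns a NumPy array whose rows are in the same order
# as X_train, so row i still corresponds to y_train[i].
X_train_scaled = pipe.fit_transform(X_train)

# Optional: restore the column names and original row index.
X_train_scaled_df = pd.DataFrame(
    X_train_scaled, columns=X_train.columns, index=X_train.index
)

# Scale the test set with the statistics learned from the training set
# (transform only, no refitting).
X_test_scaled = pipe.transform(X_test)

print(X_train_scaled_df.head())
print(y_train[:5])
```

There is nothing to extract for y_train: it is the same array that train_test_split returned, so X_train_scaled and y_train can be passed together to any estimator.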