I am trying to fit a model using a Pipeline:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# 10-fold stratified CV; the splitter receives the labels when fit() is called
cross_validation_object = StratifiedKFold(n_splits=10)
scaler = MinMaxScaler(feature_range=(0, 1))
# liblinear supports both the 'l1' and 'l2' penalties searched below
logistic_fit = LogisticRegression(solver='liblinear')

pipeline_object = Pipeline([('scaler', scaler), ('model', logistic_fit)])

# The 'model__' prefix routes each parameter to the 'model' step of the pipeline
tuned_parameters = [{'model__C': [0.01, 0.1, 1, 10],
                     'model__penalty': ['l1', 'l2']}]

grid_search_object = GridSearchCV(pipeline_object, tuned_parameters,
                                  cv=cross_validation_object, scoring='accuracy')

grid_search_object.fit(X_train, Y_train)


My question: does best_estimator_ scale the test data based on the values in the training data? For example, if I call:

grid_search_object.best_estimator_.predict(X_test)


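it will not try to fit the scaler to the X_test data, right? It will only transform X_test using the parameters that were originally fitted.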

Thanks!

Best answer

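The predict methods never fit anything to data. In this case, exactly as you describe, the best_estimator_ pipeline will scale the test data using the scaling parameters learned on the training set.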
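To check this behaviour yourself, here is a minimal sketch. It assumes a plain fitted Pipeline in place of best_estimator_ (which, with the default refit=True, is itself a Pipeline refitted on the full training data), and it uses synthetic data invented purely for illustration. It compares the scaler's learned minimum and maximum before and after predict, and also compares predict against calling transform and then the model's predict by hand.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic data, hypothetical and for illustration only
rng = np.random.RandomState(0)
X_train = rng.uniform(0.0, 10.0, size=(100, 3))
Y_train = (X_train[:, 0] > 5.0).astype(int)
# The test set deliberately covers a wider range than the training set
X_test = rng.uniform(-5.0, 20.0, size=(20, 3))

pipeline = Pipeline([('scaler', MinMaxScaler()),
                     ('model', LogisticRegression())])
pipeline.fit(X_train, Y_train)

# Record what the scaler learned from the training data
fitted_scaler = pipeline.named_steps['scaler']
min_before = fitted_scaler.data_min_.copy()
max_before = fitted_scaler.data_max_.copy()

predictions = pipeline.predict(X_test)

# The scaler's statistics are the same after predict as before it
scaler_unchanged = (np.array_equal(min_before, fitted_scaler.data_min_)
                    and np.array_equal(max_before, fitted_scaler.data_max_))

# Transforming by hand and predicting with the final step gives the same labels
X_test_scaled = fitted_scaler.transform(X_test)
manual_predictions = pipeline.named_steps['model'].predict(X_test_scaled)
same_as_manual = np.array_equal(predictions, manual_predictions)

print(scaler_unchanged, same_as_manual)
```

Because the test set here spans a wider range than the training set, the scaled test values can fall outside [0, 1]. That is what you should expect when the scaler is only applied to X_test and not refitted on it.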

On the topic of "python - When do items in a Pipeline call fit_transform(), and when do they call transform()? (scikit-learn, Pipeline)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25248418/
