Question
Below is my pipeline. It seems that I can't pass parameters to my models through the ModelTransformer class, which I took from this link: http://zacstewart.com/2014/08/05/pipelines-of-featureunions-of-pipelines.html
The error message makes sense to me, but I don't know how to fix it. Any ideas? Thanks.
# define a pipeline
pipeline = Pipeline([
    ('vect', DictVectorizer(sparse=False)),
    ('scale', preprocessing.MinMaxScaler()),
    ('ess', FeatureUnion(
        n_jobs=-1,
        transformer_list=[
            ('rfc', ModelTransformer(RandomForestClassifier(n_jobs=-1, random_state=1, n_estimators=100))),
            ('svc', ModelTransformer(SVC(random_state=1))),
        ],
        transformer_weights=None)),
    ('es', EnsembleClassifier1()),
])
# define the parameters for the pipeline
parameters = {
    'ess__rfc__n_estimators': (100, 200),
}
# ModelTransformer class, taken from
# http://zacstewart.com/2014/08/05/pipelines-of-featureunions-of-pipelines.html
class ModelTransformer(TransformerMixin):
    def __init__(self, model):
        self.model = model

    def fit(self, *args, **kwargs):
        self.model.fit(*args, **kwargs)
        return self

    def transform(self, X, **transform_params):
        # expose the wrapped model's predictions as a feature column
        return DataFrame(self.model.predict(X))
grid_search = GridSearchCV(pipeline, parameters, n_jobs=-1, verbose=1, refit=True)
Error message: ValueError: Invalid parameter n_estimators for estimator ModelTransformer.
Recommended answer
GridSearchCV has a special naming convention for nested objects. In your case, ess__rfc__n_estimators stands for ess.rfc.n_estimators and, according to the definition of the pipeline, it points to the property n_estimators of
ModelTransformer(RandomForestClassifier(n_jobs=-1, random_state=1, n_estimators=100))
Obviously, ModelTransformer instances have no such property.
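To make the naming convention concrete, here is a toy resolver that uses only the standard library. It is not scikit-learn's actual code: the Node class and its set_param method are hypothetical names invented for this illustration, and the sketch only mimics how a double-underscore name is consumed one segment at a time.

```python
class Node:
    """Hypothetical stand-in for an estimator that holds named parameters."""

    def __init__(self, label, **params):
        self.label = label
        self.params = dict(params)

    def set_param(self, name, value):
        # 'rfc__n_estimators' -> head='rfc', rest='n_estimators'
        head, sep, rest = name.partition('__')
        if head not in self.params:
            raise ValueError(
                f'Invalid parameter {head} for estimator {self.label}'
            )
        if sep:
            # More segments remain: delegate to the nested object.
            self.params[head].set_param(rest, value)
        else:
            self.params[head] = value


# Mirror the nesting from the question: pipeline -> ess -> rfc -> model.
forest = Node('RandomForestClassifier', n_estimators=100)
wrapper = Node('ModelTransformer', model=forest)
union = Node('FeatureUnion', rfc=wrapper)
pipe = Node('Pipeline', ess=union)

try:
    pipe.set_param('ess__rfc__n_estimators', 200)
    error = None
except ValueError as exc:
    error = str(exc)

# Going through the wrapper's 'model' parameter reaches the forest.
pipe.set_param('ess__rfc__model__n_estimators', 200)
```

The first call stops at the ModelTransformer level, as in the question, because the wrapper only knows about its model parameter. The second call goes one level deeper and reaches the forest.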
The fix is easy: to access the underlying object of a ModelTransformer, you need to go through its model field. So the grid parameters become
parameters = {
    'ess__rfc__model__n_estimators': (100, 200),
}
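One way to double-check a parameter name before running a search is to look at the keys returned by get_params() on the pipeline. The sketch below assumes a reasonably recent scikit-learn and a ModelTransformer that also inherits from BaseEstimator (otherwise the wrapper has no get_params to walk through). The reduced pipeline and the use of a NumPy column instead of a DataFrame are simplifications made here, not part of the original code.

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC


class ModelTransformer(TransformerMixin, BaseEstimator):
    """Wrapper from the question, with BaseEstimator added (an assumption)."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y=None, **fit_params):
        self.model.fit(X, y, **fit_params)
        return self

    def transform(self, X):
        # Return predictions as a single feature column.
        return self.model.predict(X).reshape(-1, 1)


# Reduced version of the question's pipeline: only the FeatureUnion step.
pipeline = Pipeline([
    ('ess', FeatureUnion([
        ('rfc', ModelTransformer(RandomForestClassifier(n_estimators=100, random_state=1))),
        ('svc', ModelTransformer(SVC(random_state=1))),
    ])),
])

# Every key of get_params() is a name that set_params (and GridSearchCV) accepts.
valid_names = sorted(
    name for name in pipeline.get_params() if name.endswith('n_estimators')
)
print(valid_names)

# The corrected name reaches the wrapped forest.
pipeline.set_params(ess__rfc__model__n_estimators=200)
wrapped_forest = pipeline.named_steps['ess'].transformer_list[0][1].model
```

If a name is missing from that listing, GridSearchCV will most likely reject it with the same kind of "Invalid parameter" error.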
P.S. This is not the only problem with your code. To use multiple jobs in GridSearchCV, every object you use must be copyable. That is achieved by implementing the get_params and set_params methods, which you can inherit from the BaseEstimator mixin.
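Here is a minimal sketch of what that could look like, assuming a recent scikit-learn. Several things are substitutions made only so the example runs on its own: the iris data, LogisticRegression as a stand-in for the undefined EnsembleClassifier1, the smaller n_estimators values, cv=3, and a NumPy column in place of the DataFrame. The search runs with the default single job to keep it quick.

```python
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC


class ModelTransformer(TransformerMixin, BaseEstimator):
    """Exposes a model's predictions as a feature.

    BaseEstimator supplies get_params/set_params, which clone() relies on.
    """

    def __init__(self, model):
        # Store the argument unchanged so that clone() can rebuild the object.
        self.model = model

    def fit(self, X, y=None, **fit_params):
        self.model.fit(X, y, **fit_params)
        return self

    def transform(self, X):
        return self.model.predict(X).reshape(-1, 1)


X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ('scale', MinMaxScaler()),
    ('ess', FeatureUnion([
        ('rfc', ModelTransformer(RandomForestClassifier(n_estimators=10, random_state=1))),
        ('svc', ModelTransformer(SVC(random_state=1))),
    ])),
    ('es', LogisticRegression(max_iter=1000)),  # stand-in for EnsembleClassifier1
])

# clone() builds an unfitted copy from get_params(); it needs BaseEstimator here.
pipeline_copy = clone(pipeline)

parameters = {'ess__rfc__model__n_estimators': (5, 10)}
grid_search = GridSearchCV(pipeline, parameters, cv=3, refit=True)
grid_search.fit(X, y)
print(grid_search.best_params_)
```

Keeping __init__ as a plain assignment of its argument matters: clone() reconstructs the wrapper from the values that get_params() reports, so any extra logic in the constructor tends to break copying.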