How to make a pipeline with FunctionTransformer and GridSearchCV?

Problem description

Basically, I want to treat the column index as a hyperparameter and tune it along with the other model hyperparameters in the pipeline. In my example below, col_idx is my hyperparameter. I defined a function called log_columns that applies a log transformation to certain columns, and this function can be passed to FunctionTransformer. I then put the FunctionTransformer and the model into a pipeline.

import numpy as np
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.datasets import load_digits
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import FunctionTransformer


def log_columns(X, col_idx=None):
    # Apply a natural log to the selected columns; return X untouched if none are given.
    if col_idx is None:
        return X
    X = X.astype(float)  # astype copies, so the caller's array is not modified in place
    for idx in col_idx:
        X[:, idx] = np.log(X[:, idx])
    return X

pipe = make_pipeline(FunctionTransformer(log_columns), PCA(), SVC())
# col_idx is addressed here as if it were a parameter of FunctionTransformer itself.
param_grid = dict(functiontransformer__col_idx=[None, [1]],
                  pca__n_components=[2, 5, 10],
                  svc__C=[0.1, 10, 100],
                  )

grid_search = GridSearchCV(pipe, param_grid=param_grid)
digits = load_digits()

res = grid_search.fit(digits.data, digits.target)

Then, I received the following error message:

ValueError: Invalid parameter col_idx for estimator
FunctionTransformer(accept_sparse=False, check_inverse=True,
      func=<function log_columns at 0x1764998c8>, inv_kw_args=None,
      inverse_func=None, kw_args=None, pass_y='deprecated',
      validate=None). Check the list of available parameters with
`estimator.get_params().keys()`.

I am not sure whether FunctionTransformer allows me to do what I want. If not, I would like to know of other elegant approaches. Thanks!

Recommended answer

col_idx is not a valid parameter of the FunctionTransformer class, but kw_args is. kw_args is a dictionary of additional keyword arguments that are passed to func. In your case, the only keyword argument is col_idx.
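
One way to confirm this is to list the parameter names the pipeline exposes, as the error message itself suggests. The snippet below is a minimal sketch: the body of log_columns is a placeholder (an assumption, since only its signature matters for this check), and the variable names valid_names, has_kw_args and has_col_idx are illustrative.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC


def log_columns(X, col_idx=None):
    # Placeholder body: only the signature matters for this check.
    return X


pipe = make_pipeline(FunctionTransformer(log_columns), PCA(), SVC())

# These are the names GridSearchCV accepts as keys in param_grid.
valid_names = set(pipe.get_params().keys())

has_kw_args = "functiontransformer__kw_args" in valid_names
has_col_idx = "functiontransformer__col_idx" in valid_names
print(has_kw_args, has_col_idx)
```

Keyword arguments of the wrapped function never appear in this list by themselves; they are only reachable through the kw_args entry.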

Try this:

param_grid = dict(
    functiontransformer__kw_args=[
        {'col_idx': None},
        {'col_idx': [1]}
    ],
    pca__n_components=[2, 5, 10],
    svc__C=[0.1, 10, 100],
)
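
Put together with the pipeline from the question, the corrected grid might be used as sketched below. Two details are assumptions rather than part of the original answer: np.log1p is swapped in for np.log because the digits data contains zeros (np.log would produce -inf and PCA would reject it), and cv=3 is set only to keep the run short.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC


def log_columns(X, col_idx=None):
    """Return X with log1p applied to the columns listed in col_idx."""
    if col_idx is None:
        return X
    X = np.array(X, dtype=float)  # copy, so the caller's data is untouched
    # Assumption: log1p instead of log, because the digits pixels include zeros.
    X[:, col_idx] = np.log1p(X[:, col_idx])
    return X


pipe = make_pipeline(FunctionTransformer(log_columns), PCA(), SVC())

# col_idx is tuned indirectly, through the kw_args parameter.
param_grid = {
    "functiontransformer__kw_args": [{"col_idx": None}, {"col_idx": [1]}],
    "pca__n_components": [2, 5, 10],
    "svc__C": [0.1, 10, 100],
}

digits = load_digits()
grid_search = GridSearchCV(pipe, param_grid=param_grid, cv=3)  # cv=3 is an assumption
grid_search.fit(digits.data, digits.target)

print(grid_search.best_params_)
```

Each candidate replaces the whole kw_args dictionary, so if func takes several keyword arguments, every dictionary in the list has to spell out all of them.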

08-13 19:47