GradientBoostingClassifier with a BaseEstimator in scikit-learn?

Problem description

I tried to use GradientBoostingClassifier in scikit-learn, and it works fine with its default parameters. However, when I tried to replace the BaseEstimator (passed through the init parameter) with a different classifier, fitting failed with the following error:

return y - np.nan_to_num(np.exp(pred[:, k] -
IndexError: too many indices

Do you have a solution for this problem?

The error can be reproduced with the following snippet:

import numpy as np
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.utils import shuffle

mnist = datasets.fetch_mldata('MNIST original')
X, y = shuffle(mnist.data, mnist.target, random_state=13)
X = X.astype(np.float32)
offset = int(X.shape[0] * 0.01)
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

# Works fine when init is None
clf_init = None
print('Train with clf_init = None')
clf = GradientBoostingClassifier(loss='deviance', learning_rate=0.1,
                                 n_estimators=5, subsample=0.3,
                                 min_samples_split=2,
                                 min_samples_leaf=1,
                                 max_depth=3,
                                 init=clf_init,
                                 random_state=None,
                                 max_features=None,
                                 verbose=2,
                                 learn_rate=None)
clf.fit(X_train, y_train)
print('Train with clf_init = None is done :-)')

# Fit the classifier that will be passed as the initial estimator
print('Train LogisticRegression()')
clf_init = LogisticRegression()
clf_init.fit(X_train, y_train)
print('Train LogisticRegression() is done')

# Fails when init is a fitted LogisticRegression
print('Train with clf_init = LogisticRegression()')
clf = GradientBoostingClassifier(loss='deviance', learning_rate=0.1,
                                 n_estimators=5, subsample=0.3,
                                 min_samples_split=2,
                                 min_samples_leaf=1,
                                 max_depth=3,
                                 init=clf_init,
                                 random_state=None,
                                 max_features=None,
                                 verbose=2,
                                 learn_rate=None)
clf.fit(X_train, y_train)  # <------ ERROR!!!!
print('Train with clf_init = LogisticRegression() is done')

Here is the full traceback of the error:

Traceback (most recent call last):
File "/home/mohsena/Dropbox/programing/gbm/gb_with_init.py", line 56, in <module>
   clf.fit(X_train, y_train)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 862, in fit
   return super(GradientBoostingClassifier, self).fit(X, y)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 614, in fit
   random_state)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 475, in _fit_stage
   residual = loss.negative_gradient(y, y_pred, k=k)
File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/gradient_boosting.py", line 404, in negative_gradient
   return y - np.nan_to_num(np.exp(pred[:, k] -
IndexError: too many indices
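
The last frame of the traceback indexes pred with two indices (pred[:, k]), which only works on a 2-D array. A plausible reading, and only an assumption here, is that the prediction coming from the custom init estimator is 1-D (one label per sample, which is what a classifier's predict() normally returns) rather than one column per class. The minimal NumPy-only sketch below reproduces the same kind of IndexError without scikit-learn; the names n_samples, n_classes, pred_2d and pred_1d are made up for illustration.

```python
import numpy as np

n_samples, n_classes = 6, 3  # illustrative sizes, not taken from the question

# Assumed shape the loss code expects: one score column per class.
pred_2d = np.zeros((n_samples, n_classes))
column = pred_2d[:, 1]  # selecting the column for class k=1 works
print(column.shape)

# Assumed shape produced by a plain classifier's predict(): one value per sample.
pred_1d = np.zeros(n_samples)
try:
    pred_1d[:, 1]  # two indices on a 1-D array
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)
```

If this is indeed the cause, checking the shape of clf_init.predict(X_train) before passing clf_init as init would confirm it.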

Recommended answer