I am using Stochastic Gradient Descent from scikit-learn (http://scikit-learn.org/stable/modules/sgd.html). The example given at that link is as follows:

>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0., 0.], [1., 1.]]
>>> y = [0, 1]
>>> clf = SGDClassifier(loss="hinge", penalty="l2")
>>> clf.fit(X, y)
SGDClassifier(alpha=0.0001, class_weight=None, epsilon=0.1, eta0=0.0,
       fit_intercept=True, l1_ratio=0.15, learning_rate='optimal',
       loss='hinge', n_iter=5, n_jobs=1, penalty='l2', power_t=0.5,
       random_state=None, rho=None, shuffle=False, verbose=0,
       warm_start=False)
>>> clf.coef_
array([[ 9.91080278,  9.91080278]])


If I do the same with the dataset mentioned here, I get an error. Below is what I am doing and the error I get:

 >>> import numpy as np
 >>> X = np.array([[41.9,43.4,43.9,44.5,47.3,47.5,47.9,50.2,52.8,53.2,56.7,57.0,63.5,65.3,71.1,77.0,77.8], [29.1,29.3,29.5,29.7,29.9,30.3,30.5,30.7,30.8,30.9,31.5,31.7,31.9,32.0,32.1,32.5,32.9]])
 >>> Y = np.array([251.3,251.3,248.3,267.5,273.0,276.5,270.3,274.9,285.0,290.0,297.0,302.5,304.5,309.3,321.7,330.7,349.0]).reshape((17,1))
 >>> from sklearn.linear_model import SGDClassifier
 >>> n = np.max(X.shape)
 >>> XS = np.vstack([np.ones(n), X]).T
 >>> clf = SGDClassifier(loss="hinge", penalty="l2")
 >>> clf.fit(XS, Y)
 /usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py:322: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
   y = column_or_1d(y, warn=True)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 485, in fit
     sample_weight=sample_weight)
   File "/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 389, in _fit
     classes, sample_weight, coef_init, intercept_init)
   File "/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 328, in _partial_fit
     _check_partial_fit_first_call(self, classes)
   File "/usr/local/lib/python2.6/dist-packages/sklearn/utils/multiclass.py", line 323, in _check_partial_fit_first_call
     clf.classes_ = unique_labels(classes)
   File "/usr/local/lib/python2.6/dist-packages/sklearn/utils/multiclass.py", line 94, in unique_labels
     raise ValueError("Unknown label type")
 ValueError: Unknown label type


Can someone tell me what I am doing wrong? I am also open to other implementations of gradient descent in Python.

Best answer

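You are confusing classification with regression. Judging by the output values (the "labels", Y), you are trying to do regression, since the outputs are real numbers, whereas SGDClassifier is (as the name suggests) a classification tool. It expects discrete class labels, which is why it rejects your continuous targets with "Unknown label type". Use SGDRegressor instead.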
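As a rough illustration of the switch, here is a minimal sketch that fits SGDRegressor on the data from the question. A few things in it are my own assumptions rather than part of the question: it targets a recent scikit-learn (where the `max_iter` and `tol` parameters replace the old `n_iter`), it standardizes the features with `StandardScaler` because SGD tends to behave poorly on unscaled inputs, and the hyperparameter values are illustrative only. The target is passed as a 1-D array, which also avoids the DataConversionWarning, and no column of ones is stacked on because SGDRegressor fits an intercept by default.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data from the question. Transposed so that rows are samples: shape (17, 2).
X = np.array([
    [41.9, 43.4, 43.9, 44.5, 47.3, 47.5, 47.9, 50.2, 52.8, 53.2,
     56.7, 57.0, 63.5, 65.3, 71.1, 77.0, 77.8],
    [29.1, 29.3, 29.5, 29.7, 29.9, 30.3, 30.5, 30.7, 30.8, 30.9,
     31.5, 31.7, 31.9, 32.0, 32.1, 32.5, 32.9],
]).T

# Real-valued targets as a 1-D array of shape (17,), not a column vector.
y = np.array([251.3, 251.3, 248.3, 267.5, 273.0, 276.5, 270.3, 274.9, 285.0,
              290.0, 297.0, 302.5, 304.5, 309.3, 321.7, 330.7, 349.0])

# Assumption: scaling and these hyperparameters are illustrative choices,
# not something prescribed by the question or the answer.
model = make_pipeline(
    StandardScaler(),
    SGDRegressor(penalty="l2", max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(X, y)

predictions = model.predict(X)  # one prediction per sample
r2 = model.score(X, y)          # coefficient of determination on the training data
print("coefficients:", model[-1].coef_)
print("intercept:", model[-1].intercept_)
print("training R^2:", r2)
```

The coefficients printed here apply to the standardized features, not the raw ones, so they are not directly comparable to an ordinary least-squares fit on the original columns.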
