I am using Stochastic Gradient Descent from scikit-learn (http://scikit-learn.org/stable/modules/sgd.html). The example given at the link is as follows:
>>> from sklearn.linear_model import SGDClassifier
>>> X = [[0., 0.], [1., 1.]]
>>> y = [0, 1]
>>> clf = SGDClassifier(loss="hinge", penalty="l2")
>>> clf.fit(X, y)
SGDClassifier(alpha=0.0001, class_weight=None, epsilon=0.1, eta0=0.0,
fit_intercept=True, l1_ratio=0.15, learning_rate='optimal',
loss='hinge', n_iter=5, n_jobs=1, penalty='l2', power_t=0.5,
random_state=None, rho=None, shuffle=False, verbose=0,
warm_start=False)
>>> clf.coef_
array([[ 9.91080278, 9.91080278]])
If I do the same with the dataset mentioned here, I get an error. Here is what I am doing and the error I get:
>>> import numpy as np
>>> X = np.array([[41.9,43.4,43.9,44.5,47.3,47.5,47.9,50.2,52.8,53.2,56.7,57.0,63.5,65.3,71.1,77.0,77.8], [29.1,29.3,29.5,29.7,29.9,30.3,30.5,30.7,30.8,30.9,31.5,31.7,31.9,32.0,32.1,32.5,32.9]])
>>> Y = np.array([251.3,251.3,248.3,267.5,273.0,276.5,270.3,274.9,285.0,290.0,297.0,302.5,304.5,309.3,321.7,330.7,349.0]).reshape((17,1))
>>> from sklearn.linear_model import SGDClassifier
>>> n = np.max(X.shape)
>>> XS = np.vstack([np.ones(n), X]).T
>>> clf = SGDClassifier(loss="hinge", penalty="l2")
>>> clf.fit(XS, Y)
/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py:322: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 485, in fit
sample_weight=sample_weight)
File "/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 389, in _fit
classes, sample_weight, coef_init, intercept_init)
File "/usr/local/lib/python2.6/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 328, in _partial_fit
_check_partial_fit_first_call(self, classes)
File "/usr/local/lib/python2.6/dist-packages/sklearn/utils/multiclass.py", line 323, in _check_partial_fit_first_call
clf.classes_ = unique_labels(classes)
File "/usr/local/lib/python2.6/dist-packages/sklearn/utils/multiclass.py", line 94, in unique_labels
raise ValueError("Unknown label type")
ValueError: Unknown label type
Can someone tell me what I am doing wrong? I am also open to other implementations of gradient descent in Python.

Best answer
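You are confusing classification with regression. Judging by the output values (the "labels", Y), you are trying to do regression (the outputs are real numbers), whereas SGDClassifier (as the name suggests) is a classification tool. Use SGDRegressor instead.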