Why can't LinearSVC do this simple classification?
Question
I'm trying to do the following simple classification using the LinearSVC object in scikit-learn. I've tried both version 0.10 and version 0.14, using this code:
from sklearn.svm import LinearSVC, SVC
from numpy import *
data = array([[ 1007., 1076.],
              [ 1017., 1009.],
              [ 2021., 2029.],
              [ 2060., 2085.]])
groups = array([1, 1, 2, 2])
svc = LinearSVC()
svc.fit(data, groups)
svc.predict(data)
I get the output:
array([2, 2, 2, 2])
However, if I replace the classifier with
svc = SVC(kernel='linear')
then I get the result
array([ 1., 1., 2., 2.])
which is correct. Does anyone know why LinearSVC would botch this simple problem?
Recommended answer
The algorithm underlying LinearSVC is very sensitive to extreme values in its input:
>>> svc = LinearSVC(verbose=1)
>>> svc.fit(data, groups)
[LibLinear]....................................................................................................
optimization finished, #iter = 1000
WARNING: reaching max number of iterations
Using -s 2 may be faster (also see FAQ)
Objective value = -0.001256
nSV = 4
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='l2', multi_class='ovr', penalty='l2',
random_state=None, tol=0.0001, verbose=1)
(The warning refers to the LibLinear FAQ, since scikit-learn's LinearSVC is based on that library.)
You should normalize the data first. Here, scale standardizes each feature to zero mean and unit variance:
>>> from sklearn.preprocessing import scale
>>> data = scale(data)
>>> svc.fit(data, groups)
[LibLinear]...
optimization finished, #iter = 39
Objective value = -0.240988
nSV = 4
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
intercept_scaling=1, loss='l2', multi_class='ovr', penalty='l2',
random_state=None, tol=0.0001, verbose=1)
>>> svc.predict(data)
array([1, 1, 2, 2])
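One caveat with calling scale directly is that it does not remember the mean and standard deviation it used, so new samples cannot be scaled the same way at prediction time. A minimal sketch of one way to handle that, assuming a scikit-learn version that provides StandardScaler and make_pipeline (both newer than the 0.10 release mentioned in the question), is to put the scaler and the classifier in a pipeline. The new_points values are hypothetical and only serve as an illustration; they are not from the original post.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Same toy data as in the question.
data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])
groups = np.array([1, 1, 2, 2])

# StandardScaler learns each feature's mean and standard deviation in fit()
# and reapplies them in predict(), so new samples are scaled consistently.
clf = make_pipeline(StandardScaler(), LinearSVC(random_state=0))
clf.fit(data, groups)

# Predictions on the training data, unscaled input goes straight in.
train_pred = clf.predict(data)

# Hypothetical unseen points, one near each cluster (not from the original post).
new_points = np.array([[1050., 1020.],
                       [2040., 2050.]])
new_pred = clf.predict(new_points)

print(train_pred)
print(new_pred)
```

With this arrangement the raw, unscaled features are passed to both fit and predict, and the pipeline applies the stored scaling internally.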