Problem description
After training an SVM classifier with scikit-learn, I need the alpha values, which are the Lagrange multipliers of the SVM dual problem. According to the documentation, scikit-learn seems to provide only svm.dual_coef_, which is the product of the Lagrange multiplier alpha and the label of a data point.
I tried to calculate the alpha values manually by dividing the elements of svm.dual_coef_ by the data labels. However, svm.dual_coef_ stores only the coefficients of the support vectors, so I am not sure whether, when I iterate over this array, the support vectors appear in the same order as in the original training data.
So is there a reliable way to get the alpha values of the support vectors?
Recommended answer
Since the alpha values of support vectors are positive by definition, you can obtain them by taking the absolute value of dual_coef_:
alphas = np.abs(svm.dual_coef_)
which is a direct consequence of the fact that
svm.dual_coef_[i] = labels[i] * alphas[i]
where labels[i] is either -1 or +1 and alphas[i] is always positive. Furthermore, you can also recover each label through
labels = np.sign(svm.dual_coef_)
using the same observation. This is also why scikit-learn does not store the alphas separately: they are uniquely determined by dual_coef_ together with the labels.
This is easy to see once you go through all possible cases:
- labels[i] == -1 and alphas[i] > 0 => dual_coef_[i] < 0 and dual_coef_[i] == -alphas[i] == labels[i] * alphas[i]
- labels[i] == -1 and alphas[i] < 0 => impossible (alphas are non-negative)
- labels[i] == -1 and alphas[i] == 0 => it is not a support vector
- labels[i] == +1 and alphas[i] > 0 => dual_coef_[i] > 0 and dual_coef_[i] == alphas[i] == labels[i] * alphas[i]
- labels[i] == +1 and alphas[i] < 0 => impossible (alphas are non-negative)
- labels[i] == +1 and alphas[i] == 0 => it is not a support vector
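The case analysis above can be checked with a few lines of NumPy. This is only a sketch: the labels and alphas arrays below are hypothetical values made up for illustration, not output from a trained model.

```python
import numpy as np

# Hypothetical labels and multipliers for four support vectors
# (made-up numbers, not taken from a real model).
labels = np.array([-1, 1, 1, -1])
alphas = np.array([0.5, 0.25, 1.0, 0.75])

# The signed product label * alpha, i.e. what dual_coef_ represents.
dual_coef = labels * alphas

# Both factors can be recovered from the product alone, because
# every alpha is strictly positive and every label is -1 or +1.
recovered_alphas = np.abs(dual_coef)
recovered_labels = np.sign(dual_coef).astype(int)

print(recovered_alphas)
print(recovered_labels)
```

Because no support vector has alpha equal to zero, np.sign never returns 0 here, so the recovered labels are always -1 or +1.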
Consequently, if dual_coef_[i] is positive, then it equals alphas[i] and the support vector belongs to the positive class; if it is negative, then alphas[i] == -dual_coef_[i] and the support vector belongs to the negative class.
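As for the ordering concern in the question, a fitted SVC also exposes support_, the training-set indices of the support vectors, which can be used to map each alpha back to its original sample. The following is a minimal sketch, assuming a binary problem with labels in {-1, +1}; the toy data and the variable names sv_alphas and full_alphas are illustrative choices, not part of the original answer.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data with labels in {-1, +1} (illustrative only).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# For a binary problem dual_coef_ has shape (1, n_support_vectors).
signed = clf.dual_coef_.ravel()
sv_alphas = np.abs(signed)

# support_ gives the index in the training data of each support vector,
# aligned with the columns of dual_coef_.
full_alphas = np.zeros(len(y))
full_alphas[clf.support_] = sv_alphas

# Training samples that are not support vectors keep alpha == 0.
print(clf.support_)
print(full_alphas)
```

With this mapping there is no need to guess the order: full_alphas[i] is the multiplier of training sample i, and it is zero for every sample that is not a support vector.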