python - Explanation of the SVC multiclass OVO decision function

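I am trying to understand how to interpret decision_function values in a multiclass classification scheme that uses the One-vs-One (OVO) method. I created 2D sample data with 4 classes of 100 samples each, stored in the variables X and y. Here is the code: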

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# 4 Classes - Make 4 separate datasets
d1, o1 = make_blobs(n_samples = 100, n_features = 2, centers = 1, random_state=0, cluster_std = 0.5)
d2, o2 = make_blobs(n_samples = 100, n_features = 2, centers = 1, cluster_std = 0.5)
d3, o3 = make_blobs(n_samples = 100, n_features = 2, centers = 1, cluster_std = 0.5)
d4, o4 = make_blobs(n_samples = 100, n_features = 2, centers = 1, cluster_std = 0.5)
X = np.vstack((d1,d2,d3,d4))
y = np.hstack((np.repeat(0,100), np.repeat(1,100), np.repeat(2,100), np.repeat(3,100))).T
print('0 - Red, 1 - Green, 2 - Blue, 3 - Yellow')
cols = np.hstack((np.repeat('r',100), np.repeat('g',100), np.repeat('b',100), np.repeat('y',100))).T
svm_ovr = SVC(kernel='linear', gamma='auto', decision_function_shape='ovr')
svm_ovr.fit(X, y)

svm_ovo = SVC(kernel='linear', gamma='auto', decision_function_shape='ovo')
svm_ovo.fit(X, y)

print('OVR Configuration Costs for 4 Class Classification Data:')
print('Cost: ' + str(svm_ovr.decision_function([[2,2]])))
print('Prediction: ' + str(svm_ovr.predict([[2,2]])))
print('No. Support Vectors: ' + str(svm_ovr.n_support_))
print('OVO Configuration Costs for 4 Class Classification Data:')
print('Cost: ' + str(svm_ovo.decision_function([[2,2]])))
print('Prediction: ' + str(svm_ovo.predict([[2,2]])))
print('No. Support Vectors: ' + str(svm_ovo.n_support_))

The output of this snippet is:

OVR Configuration Costs for 4 Class Classification Data:
Cost: [[ 3.23387565  0.77664387 -0.17878109  2.15179802]]
Prediction: [0]
No. Support Vectors: [2 4 1 3]
OVO Configuration Costs for 4 Class Classification Data:
Cost: [[ 0.68740472  0.77724567  0.88685872  0.14910583 -1.49263233 -0.23041644]]
Prediction: [0]

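My guess is that in the OVR scenario the highest value, 3.23, belongs to the class 0 vs. rest model, which is why the prediction is 0.
Could you explain how SVC predicts class 0 for the test point from the values of the 6 models in the OVO case?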

Best answer

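In the OVO approach, a binary classifier is built for every possible pair of classes, which gives nC2 = n(n-1)/2 models, where n is the total number of classes. With n = 4 in your case, 6 models are built.

The 6 models are built in lexicographic order of the class pairs (classes are numbered 1 to 4 here, corresponding to your labels 0 to 3), i.e.: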

  Model 1 = class 1 vs class 2
  Model 2 = class 1 vs class 3
  Model 3 = class 1 vs class 4
  Model 4 = class 2 vs class 3
  Model 5 = class 2 vs class 4
  Model 6 = class 3 vs class 4
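
The pair ordering above can be reproduced with a short sketch. It assumes the labels 0 to 3 from the question and uses `itertools.combinations`, which yields pairs in lexicographic order; the variable names `classes` and `pairs` are illustrative and not part of scikit-learn.

```python
from itertools import combinations

# Labels as used in the question's y array (assumed sorted, as in clf.classes_).
classes = [0, 1, 2, 3]

# Every unordered pair of classes, in lexicographic order.
pairs = list(combinations(classes, 2))

for column, (first, second) in enumerate(pairs):
    print(f"column {column}: class {first} vs class {second}")
```

With 4 classes this gives 6 pairs, one per column of the OVO decision_function output.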

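So the decision function is an array of 6 values, one per model. Each value is that model's signed decision value for the point, which is proportional to its distance from the model's separating hyperplane. A positive value puts the point on the side of the first class in the pair, and a negative value puts it on the side of the second class: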

Cost: [[ 0.68740472  0.77724567  0.88685872  0.14910583 -1.49263233 -0.23041644]]

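Using the decision function array, you can therefore read off each model's prediction from the sign of its value. Your predictions are:

  Model 1 = class 1
  Model 2 = class 1
  Model 3 = class 1
  Model 4 = class 2
  Model 5 = class 4
  Model 6 = class 4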

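Now you just take a majority vote over the models and use the winner as the prediction. Here class 1 gets 3 votes, class 4 gets 2 and class 2 gets 1, so the result is class 1, which is label 0 in your data.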
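
As a rough check, the vote can be recomputed by hand from the six values printed in the question. This is a minimal sketch rather than scikit-learn's internal code: the names `dec`, `votes` and `predicted` are illustrative, labels 0 to 3 are assumed, and ties are simply broken by the lowest label via `np.argmax`.

```python
from itertools import combinations

import numpy as np

# OVO decision values copied from the question's output for the point [2, 2].
dec = np.array([0.68740472, 0.77724567, 0.88685872,
                0.14910583, -1.49263233, -0.23041644])
classes = [0, 1, 2, 3]  # labels used in the question

# One vote per pairwise model: a positive value votes for the first class
# of the pair, a negative value for the second.
votes = np.zeros(len(classes), dtype=int)
for value, (first, second) in zip(dec, combinations(range(len(classes)), 2)):
    winner = first if value > 0 else second
    votes[winner] += 1

# Simplification: argmax breaks ties in favour of the lowest label.
predicted = classes[int(np.argmax(votes))]
print("votes per label:", votes)
print("predicted label:", predicted)
```

Label 0 collects three votes, which matches the `Prediction: [0]` line in the question's output.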

Hope this helps!

Source: the question "python - Explanation of the SVC multiclass OVO decision function" on Stack Overflow: https://stackoverflow.com/questions/59497115/