Problem Description
I'm currently working on classifying images with different image descriptors. Since each descriptor comes with its own metric, I am using precomputed kernels. So given these NxN kernel matrices (for a total of N images), I want to train and test an SVM. I'm not very experienced with SVMs, though.
What confuses me is how to supply the input for training. Using an MxM subset of the kernel (M being the number of training images) trains the SVM with M features. However, if I understood correctly, this limits me to test data with the same number of features. Trying to use a sub-kernel of size MxN causes infinite loops during training; consequently, using more features when testing gives poor results.
This results in using equally sized training and test sets, which gives reasonable results. But if I only wanted to classify, say, one image, or to train with a given number of images for each class and test with the rest, it doesn't work at all.
How can I remove the dependency between the number of training images and the number of features, so I can test with any number of images?
I'm using libsvm for MATLAB; the kernels are distance matrices with values in [0,1].
Recommended Answer
You seem to have already figured out the problem... According to the README file included in the MATLAB package, to use a precomputed kernel you must include the sample serial number as the first column of the training and testing data, and, importantly, the testing matrix must contain the kernel values between the testing instances and the training instances.
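In other words, the dependency described in the question disappears once the test kernel is built against the training data only. Schematically (the sizes below are hypothetical, purely for illustration):

%# shapes only -- hypothetical M and T chosen for illustration
M = 150;               %# number of training images
T = 120;               %# number of test images (can be any value, even 1)
Ktrain = zeros(M,M);   %# kernel values between all pairs of training images
Ktest  = zeros(T,M);   %# kernel values between each test image and the
                       %# M TRAINING images -- never between test pairs

The number of "features" is thus always M, the number of training images, regardless of how many images you test with.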
Let me illustrate with an example:
%# read dataset
[dataClass, data] = libsvmread('./heart_scale');
%# split into train/test datasets
trainData = data(1:150,:);
testData = data(151:270,:);
trainClass = dataClass(1:150,:);
testClass = dataClass(151:270,:);
numTrain = size(trainData,1);
numTest = size(testData,1);
%# radial basis function: exp(-gamma*|u-v|^2)
gamma = 2e-3;
rbfKernel = @(X,Y) exp(-gamma .* pdist2(X,Y,'euclidean').^2);
%# compute kernel matrices between every pairs of (train,train) and
%# (test,train) instances and include sample serial number as first column
K = [ (1:numTrain)' , rbfKernel(trainData,trainData) ];
KK = [ (1:numTest)' , rbfKernel(testData,trainData) ];
%# train and test
model = svmtrain(trainClass, K, '-t 4');    %# -t 4: precomputed kernel
[predClass, acc, decVals] = svmpredict(testClass, KK, model);
%# confusion matrix
C = confusionmat(testClass,predClass)
The output:
*
optimization finished, #iter = 70
nu = 0.933333
obj = -117.027620, rho = 0.183062
nSV = 140, nBSV = 140
Total nSV = 140
Accuracy = 85.8333% (103/120) (classification)
C =
65 5
12 38
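To address the original question directly: classifying a single image just means building a 1-by-numTrain kernel row between that image and the training data. The sketch below reuses the variables defined above; note that the label vector passed to svmpredict is only used to compute the reported accuracy, so a dummy value works when the true label is unknown.

%# classify a single image: 1-by-numTrain kernel row against TRAINING data
oneImage = testData(1,:);                        %# stand-in for a new image
Kone = [ 1 , rbfKernel(oneImage, trainData) ];   %# serial number + kernel row
[predOne, accOne, decOne] = svmpredict(testClass(1), Kone, model);

The same pattern extends to any number of test images: the test kernel always has one row per test image and one column per training image.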