Question
I know this question has already been asked a couple of times, but I couldn't find a solution to my problem.
I don't have more variables than observations, and there are no NaN values in my matrix. Here's my function:
function [ind, idx_ran] = fselect(features_f, class_f, dir)
idx = linspace(1,size(features_f, 2), size(features_f, 2));
idx_ran = idx(:,randperm(size(features_f, 2)));
features_t_ran = features_f(:,idx_ran); % randomize columns
len = length(class_f);
r = randi(len, [1, round(len*0.15)]);
x = features_t_ran;
y = class_f;
xtrain = x;
ytrain = y;
xtrain(r,:) = [];
ytrain(r,:) = [];
xtest = x(r,:);
ytest = y(r,:);
f = @(xtrain, ytrain, xtest, ytest)(sum(~strcmp(ytest, classify(xtest, xtrain, ytrain))));
fs = sequentialfs(f, x, y, 'direction', dir);
ind = find(fs < 1);
end
And here are my test and training data:
>> whos xtest
Name Size Bytes Class Attributes
xtest 524x42 176064 double
>> whos xtrain
Name Size Bytes Class Attributes
xtrain 3008x42 1010688 double
>> whos ytest
Name Size Bytes Class Attributes
ytest 524x1 32488 cell
>> whos ytrain
Name Size Bytes Class Attributes
ytrain 3008x1 186496 cell
>>
This is the error:
Error using crossval>evalFun (line 465)
The function
'@(xtrain,ytrain,xtest,ytest)(sum(~strcmp(ytest,classify(xtest,xtrain,ytrain))))' generated
the following error:
The pooled covariance matrix of TRAINING must be positive definite.
Error in crossval>getFuncVal (line 482)
funResult = evalFun(funorStr,arg(:));
Error in crossval (line 324)
funResult = getFuncVal(1, nData, cvp, data, funorStr, []);
Error in sequentialfs>callfun (line 485)
funResult = crossval(fun,x,other_data{:},...
Error in sequentialfs (line 353)
crit(k) = callfun(fun,x,other_data,cv,mcreps,ParOptions);
Error in fselect (line 26)
fs = sequentialfs(f, x, y, 'direction', dir);
Error in workflow_forward (line 31)
[ind, idx_ran] = fselect(features_f, class_f, 'forward');
It worked yesterday. :/
Recommended answer
If you inspect the function classify, you will find that the error is raised when it checks the condition number of the matrix R obtained from the QR decomposition of your training matrix. In other words, it is unhappy with the training matrix you are providing: it finds the matrix ill-conditioned, so any solution would be unstable. (The function performs the equivalent of a matrix inversion, which for an ill-conditioned training matrix amounts to dividing by a very small number.)
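A similar check can be run outside classify to see which training set triggers the problem. This is a minimal sketch, assuming the default linear discriminant and a rank-style tolerance. checkPooledCov is a hypothetical helper name, not a toolbox function, and the exact test inside classify may differ between releases.

```matlab
function [isSingular, r] = checkPooledCov(xtrain, ytrain)
% checkPooledCov  Hypothetical helper, not part of any toolbox.
% Reports whether the pooled within-class covariance of xtrain looks
% numerically singular, and returns its numerical rank r.
    [~, ~, g] = unique(ytrain);        % group index 1..k for every row
    k = max(g);
    [n, d] = size(xtrain);

    % Subtract each class mean from the rows belonging to that class.
    xc = xtrain;
    for j = 1:k
        rows = (g == j);
        xc(rows, :) = bsxfun(@minus, xtrain(rows, :), mean(xtrain(rows, :), 1));
    end

    % Singular values of the centered data; zero (or tiny) values mean the
    % pooled covariance xc'*xc/(n-k) is not positive definite.
    s = svd(xc);
    tol = max(n, d) * eps(max(s));     % assumed tolerance, rank-style
    r = sum(s > tol);
    isSingular = (r < d);
end
```

Calling [bad, r] = checkPooledCov(xtrain, ytrain) shows whether r falls below the number of columns. Keep in mind that sequentialfs passes subsets of columns to the criterion function, so with the linear discriminant a single column that is constant within every class of a training fold would already be enough to produce the error.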
It seems that shrinking your training set reduced its stability. My suggestion is to use a larger training set if possible.
Edit
You may be wondering how it is possible to have more observations than variables and still have an ill-conditioned problem. The answer is that different observations can be linear combinations of each other, so the observations may span fewer dimensions than there are variables.
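A small made-up example illustrates the point. The data below are invented for illustration only: there are six observations and three variables, yet the observations span only two dimensions. The column-pivoted QR at the end is one possible way to spot a redundant column, and its tolerance is an assumption.

```matlab
% Illustrative data (made up): 6 observations, 3 variables, but the third
% variable is the sum of the first two.
a = [1; 2; 3; 4; 5; 6];
b = [2; 1; 4; 3; 6; 5];
x = [a, b, a + b];

r = rank(x);                        % numerical rank is 2, not 3

% Economy-size QR with column pivoting: p is a permutation vector and
% abs(diag(R)) is non-increasing, so tiny trailing entries flag columns
% that add nothing beyond the ones pivoted before them.
[~, R, p] = qr(x, 0);
dR = abs(diag(R));
tol = max(size(x)) * eps(dR(1));    % assumed tolerance, rank-style
redundant = p(dR.' <= tol);         % indices of numerically redundant columns
```

Dropping the columns listed in redundant (or the features they correspond to) leaves a full-rank matrix. Which of the dependent columns gets flagged depends on the pivoting order.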