I am trying to use LIBSVM's epsilon-SVR from Java to forecast a univariate time series (my data has two columns: a timestamp and a numeric value).

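When I use no real features and simply treat the array index as the only feature (which I know is not a sound choice), the model always returns the same value. If I use a sliding window, i.e. the features for predicting the value at time t are the values at times t-1, t-2, ..., t-sliding_window, it always returns NaN.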
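
To make the window layout concrete, here is a minimal plain-Java sketch. The helper names `buildFeatures` and `buildTargets` are hypothetical (they are not part of LIBSVM or of the code in this post), and the sketch assumes the first `window` samples are not used as targets, so every feature row is fully populated.

```java
import java.util.Arrays;

class SlidingWindowSketch {
    // Row r holds the `window` values that precede target index t = r + window,
    // ordered oldest first: series[t-window], ..., series[t-1].
    static double[][] buildFeatures(double[] series, int window) {
        if (window <= 0 || window >= series.length) {
            throw new IllegalArgumentException("window must be in [1, series.length - 1]");
        }
        int rows = series.length - window;
        double[][] x = new double[rows][window];
        for (int r = 0; r < rows; r++) {
            System.arraycopy(series, r, x[r], 0, window);
        }
        return x;
    }

    // Targets are the values that follow each window.
    static double[] buildTargets(double[] series, int window) {
        if (window <= 0 || window >= series.length) {
            throw new IllegalArgumentException("window must be in [1, series.length - 1]");
        }
        return Arrays.copyOfRange(series, window, series.length);
    }

    public static void main(String[] args) {
        double[] series = {1, 2, 3, 4, 5};
        System.out.println(Arrays.deepToString(buildFeatures(series, 2)));
        System.out.println(Arrays.toString(buildTargets(series, 2)));
    }
}
```

With this layout a series of length n yields n - window training rows, and no row needs a placeholder value.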

I train the model as follows:

public svm_model train(double[] series, int svmType, int kernelType, int degree, double gamma, double coef0, double C, double eps, double p, int shrinking, int nFeatures)
{
    series = normalize(series);
    svm_parameter params = new svm_parameter();
    svm_problem problem = new svm_problem();
    svm_node node = null;
    //----------Set parameters----------
    params.svm_type  = svmType;
    params.kernel_type = kernelType;
    params.degree = degree;
    params.gamma = 1/nFeatures;
    params.coef0 = coef0;
    params.C = C;
    params.eps = eps;
    params.cache_size=100;
    params.p = p;
    params.shrinking= shrinking;
    //----------Define problem----------
    problem.l = series.length;
    problem.y = series;
    problem.x = new svm_node[series.length][];
    for(int i=0;i<series.length;i++)
    {
       problem.x[i] = new svm_node[1];
       node = new svm_node();
       node.index = 0;
       node.value = i;
       problem.x[i][0] = node;
     }
    //----------Generate model----------
    svm_model svm_model = svm.svm_train(problem,params);
    return svm_model;
}



public svm_model trainSlidingWindow(double[] series, int svmType, int kernelType, int degree, double gamma, double coef0, double C, double eps, double p, int shrinking, int nFeatures, int slidingWindow)
{
    series = normalize(series);
    svm_parameter params = new svm_parameter();
    svm_problem problem = new svm_problem();
    svm_node node = null;
    //----------Set parameters----------
    params.svm_type  = svmType;
    params.kernel_type = kernelType;
    params.degree = degree;
    params.gamma = 1/nFeatures;
    params.coef0 = coef0;
    params.C = C;
    params.eps = eps;
    params.cache_size=100;
    params.p=p;
    params.shrinking= shrinking;
    //----------Define problem----------
    problem.l = series.length;
    problem.y = series;
    problem.x = new svm_node[series.length][slidingWindow];
    for(int i=0;i<series.length;i++)
    {
       problem.x[i] = new svm_node[slidingWindow];
       for(int j=0; j<slidingWindow;j++)
       {
          node = new svm_node();
          node.index = slidingWindow-(j+1);
          if(i-(j+1) <0)
             node.value = Double.NaN;
          else
             node.value = series[i-(j+1)];
          problem.x[i][j] = node;
       }
    }
   //----------Generate model----------
   svm_model svm_model = svm.svm_train(problem,params);
   return svm_model;
}


The predictions are obtained as follows:

public double[] predict(double[] series, svm_model model, int steps)
{
    series = normalize(series);
    double[] yPred = new double[steps];
    for(int i=0;i<steps;i++)
    {
        svm_node[] nodes = new svm_node[1];
        svm_node node = new svm_node();
        node.index = 0;
        node.value = series.length + i;
        nodes[0] = node;
        yPred[i] = svm.svm_predict(model,nodes);
    }
    return denormalize(yPred);
}

public double[] predictSlidingWindow(double[] series, svm_model model, int steps, int slidingWindow)
{
    series = normalize(series);
    double[] yPred = new double[steps];
    double[] aux = new double[slidingWindow+steps];
    System.arraycopy(series,series.length-slidingWindow,aux,0, slidingWindow);
    for(int i=0;i<steps;i++)
    {
        svm_node[] nodes = new svm_node[slidingWindow];
        for(int j=0;j<slidingWindow;j++)
        {
            svm_node node = new svm_node();
            node.index = slidingWindow-(j+1);
            node.value = aux[i+j];
            nodes[j] = node;
        }
        yPred[i] = svm.svm_predict(model,nodes);
        aux[slidingWindow+i] = yPred[i];
    }
    return denormalize(yPred);
}


What am I doing wrong?
Thanks in advance.

Best answer

Apparently, normalizing the data and changing the value of the gamma parameter to 1 solves the problem.
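
One plausible reason the gamma value matters, offered as an assumption since the question states neither the kernel nor `nFeatures`: the training code writes `params.gamma = 1/nFeatures`, which is integer division in Java and evaluates to 0 whenever `nFeatures` is greater than 1. With an RBF kernel, exp(-gamma * ||u - v||^2), a gamma of 0 makes every kernel value equal to 1, so all inputs look identical to the model. The sketch below reproduces both effects in plain Java, and also shows that a NaN feature value (as used to pad the sliding window in the question) propagates through the kernel. The class and method names are hypothetical.

```java
class GammaSketch {
    // RBF kernel in the form LIBSVM documents: exp(-gamma * ||u - v||^2).
    static double rbf(double[] u, double[] v, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < u.length; i++) {
            double d = u[i] - v[i];
            squaredDistance += d * d;
        }
        return Math.exp(-gamma * squaredDistance);
    }

    // Integer division: the quotient is truncated before widening to double.
    static double gammaWithIntDivision(int nFeatures) {
        return 1 / nFeatures;
    }

    // Floating-point division gives the intended 1/nFeatures.
    static double gammaWithDoubleDivision(int nFeatures) {
        return 1.0 / nFeatures;
    }

    public static void main(String[] args) {
        double[] a = {0.1, 0.2, 0.3, 0.4};
        double[] b = {0.9, 0.8, 0.7, 0.6};
        System.out.println(gammaWithIntDivision(4));     // integer division result
        System.out.println(gammaWithDoubleDivision(4));  // floating-point result
        System.out.println(rbf(a, b, gammaWithIntDivision(4)));
        System.out.println(rbf(a, b, gammaWithDoubleDivision(4)));
        // A single NaN feature makes the whole kernel value NaN.
        System.out.println(rbf(new double[]{Double.NaN, 0.2}, new double[]{0.1, 0.2}, 0.5));
    }
}
```

If this is indeed the cause, writing `1.0 / nFeatures` (or passing the `gamma` argument through) and dropping the rows that would contain NaN are the corresponding fixes.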

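Normalizing is good practice when building support vector regression models on data with a large domain; it improves both the quality of the predictions and the execution time.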
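
The question does not show its `normalize` and `denormalize` helpers, so the following is only a sketch of one common choice, min-max scaling to [0, 1]. The class `MinMaxScaler` and its methods are hypothetical names. The scaler remembers the minimum and maximum of the training series so that predictions can be mapped back to the original scale with the same parameters.

```java
import java.util.Arrays;

class MinMaxScaler {
    private final double min;
    private final double max;

    // Learn the scaling parameters from the training series.
    MinMaxScaler(double[] training) {
        if (training.length == 0) {
            throw new IllegalArgumentException("training series must not be empty");
        }
        double lo = training[0];
        double hi = training[0];
        for (double v : training) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        this.min = lo;
        this.max = hi;
    }

    // Map values to [0, 1]; a constant series maps to all zeros to avoid 0/0.
    double[] normalize(double[] values) {
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (range == 0.0) ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    // Inverse mapping, applied to model outputs.
    double[] denormalize(double[] scaled) {
        double range = max - min;
        double[] out = new double[scaled.length];
        for (int i = 0; i < scaled.length; i++) {
            out[i] = scaled[i] * range + min;
        }
        return out;
    }

    public static void main(String[] args) {
        MinMaxScaler scaler = new MinMaxScaler(new double[]{10, 20, 30});
        double[] scaled = scaler.normalize(new double[]{10, 20, 30});
        System.out.println(Arrays.toString(scaled));
        System.out.println(Arrays.toString(scaler.denormalize(scaled)));
    }
}
```

Keeping the parameters in one object avoids a subtle problem: a `denormalize(yPred)` that only sees the predictions has no way to know the range of the original data.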

07-27 19:32