java - Explicit specification of variables for stochastic gradient descent

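I have a binary classification problem: determining the class associated with particular documents. The documents are represented as bag-of-words style feature vectors of the following form:

Example: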

Document 1 = ["I", "am", "awesome"]
Document 2 = ["I", "am", "great", "great"]

The dictionary is:

["I", "am", "awesome", "great"]

So, as vectors, the documents look like this:

Document 1 = [1, 1, 1, 0]
Document 2 = [1, 1, 0, 2]
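
For reference, the mapping from tokenized documents to count vectors can be sketched as follows. This is only an illustration: the class name `BagOfWords` and its two helper methods are hypothetical and do not appear in my actual code, and it assumes that words missing from the dictionary are simply ignored.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BagOfWords {
    // Maps each dictionary word to its column index, in dictionary order.
    static Map<String, Integer> indexOf(List<String> dictionary) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String word : dictionary) {
            index.putIfAbsent(word, index.size());
        }
        return index;
    }

    // Counts how often each dictionary word occurs in the document.
    // Assumption: words that are not in the dictionary are skipped.
    static int[] vectorize(List<String> document, Map<String, Integer> index) {
        int[] counts = new int[index.size()];
        for (String word : document) {
            Integer column = index.get(word);
            if (column != null) {
                counts[column]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = indexOf(Arrays.asList("I", "am", "awesome", "great"));
        // Prints [1, 1, 1, 0]
        System.out.println(Arrays.toString(vectorize(Arrays.asList("I", "am", "awesome"), index)));
        // Prints [1, 1, 0, 2]
        System.out.println(Arrays.toString(vectorize(Arrays.asList("I", "am", "great", "great"), index)));
    }
}
```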

I want to apply the stochastic gradient descent algorithm to this input in order to "minimize the empirical risk involving the hinge loss".

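I have searched high and low to see how a stochastic gradient descent algorithm would accept input in this form, but I cannot find a straightforward explanation anywhere.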

This is the pseudocode from Wikipedia:

Choose an initial vector of parameters w and learning rate alpha.
Repeat until an approximate minimum is obtained:
    Randomly shuffle examples in the training set.
    For i = 1, 2, ..., n, do:
        w := w - alpha * ∇Q_i(w)

Can someone explain to me how the input I am working with fits into this pseudocode?

I have seen data represented like this:

private List<Point2D> loadData()
{
    List<Point2D> data = new ArrayList<>();
    data.add(new Point2D.Double(1, 2));
    data.add(new Point2D.Double(2, 3));
    data.add(new Point2D.Double(3, 4));
    data.add(new Point2D.Double(4, 5));
    data.add(new Point2D.Double(5, 6));
    data.add(new Point2D.Double(6, 7));
    return data;
}

And also like this:

 static double[] x = {2, 4, 6, 8};
 static double[] y = {2, 5, 5, 8};

I suppose the latter is closer to my situation.

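Here is a perceptron implementation that I would like to modify so that it performs stochastic gradient descent. Could someone point out where I need to make the changes, and how?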

public static void perceptron(Set<String> globoDict,
   Map<String, int[]> trainingPerceptronInput,
   Map<String, int[]> testPerceptronInput)
{
    //store weights to be averaged.
   Map<Integer,double[]> cached_weights = new HashMap<Integer,double[]>();


   final int globoDictSize = globoDict.size(); // number of features

   // one weight per dictionary word, plus one for the bias (stored in the last slot)
   double[] weights = new double[globoDictSize + 1];
   for (int i = 0; i < weights.length; i++)
   {
       weights[i] = 0.0;
   }


   int inputSize = trainingPerceptronInput.size();
   double[] outputs = new double[inputSize];
   final double[][] a = Prcptrn_InitOutpt.initializeOutput(trainingPerceptronInput, globoDictSize, outputs, LABEL);


   double globalError;
   int iteration = 0;
   do
   {
       iteration++;
       globalError = 0;
       // loop through all instances (complete one epoch)
       for (int p = 0; p < inputSize; p++)
       {
           // calculate predicted class
           double output = Prcptrn_CalcOutpt.calculateOutput(THETA, weights, a, p);
           // difference between predicted and actual class values
           //always either zero or one
           double localError = outputs[p] - output;

           int i;
           for (i = 0; i < a.length; i++)
           {
               weights[i] += LEARNING_RATE * localError * a[i][p];
           }
           weights[i] += LEARNING_RATE * localError;

           // summation of squared error (error value for all instances)
           globalError += localError * localError;
       }

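       //store a copy of the weights for averaging; without clone() every entry
       //would reference the same array and the average would equal the final weights
       cached_weights.put( iteration , weights.clone() );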

       /* Root Mean Squared Error */
       System.out.println("Iteration " + iteration + " : RMSE = " + Math.sqrt(globalError / inputSize));
   }
   while (globalError != 0 && iteration <= MAX_ITER);



   int size = globoDictSize + 1;
   //compute averages
   double[] sums = new double[size];
   double[] averages = new double[size];

   for (Entry<Integer, double[]> entry : cached_weights.entrySet())
   {
       double[] value = entry.getValue();
       for(int pos=0; pos < size; pos++){
           sums[ pos ] +=  value[ pos ];
       }
   }
   for(int pos=0; pos < size; pos++){
       averages[ pos ] = sums[ pos ] / cached_weights.size();
   }


   System.out.println("\n=======\nDecision boundary equation:");
   int i;
   for (i = 0; i < a.length; i++)
   {
       System.out.print(" a");
       if (i < 10) System.out.print(0);
       System.out.println( i + " * " + weights[i] + " + " );


   }
   System.out.println(" bias: " + weights[i]);


   //TEST
   //this works because, at this point the weights have already been learned.
   inputSize = testPerceptronInput.size();
   outputs = new double[inputSize];
   double[][] z = Prcptrn_InitOutpt.initializeOutput(testPerceptronInput, globoDictSize, outputs, LABEL);

   double test_output = Prcptrn_CalcOutpt.calculateOutput(THETA, weights, z, TEST_CLASS);

   System.out.println("class = " + test_output);

}

Best answer

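You need to write an expression in which the weights multiply your data representation, and the result is plugged into the loss function you chose. That is how you write Q.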
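
As a rough illustration for the hinge loss, the per-example term could be written as Q_i(w) = max(0, 1 - y_i * (w . x_i + b)). The sketch below makes several assumptions that are not in the question: labels are encoded as -1 and +1 (0/1 labels would have to be remapped first), the bias is kept in the last slot of the weight array as in the perceptron code, and the class and method names are hypothetical. Because the hinge loss has a kink where the margin equals 1, the code returns a subgradient rather than a true gradient.

```java
class HingeLoss {
    // Score of one example: w . x + b, with the bias stored in the last slot of w.
    static double score(double[] w, int[] x) {
        double s = w[w.length - 1];
        for (int j = 0; j < x.length; j++) {
            s += w[j] * x[j];
        }
        return s;
    }

    // Q_i(w) = max(0, 1 - y * score). Assumption: y is -1 or +1.
    static double loss(double[] w, int[] x, int y) {
        return Math.max(0.0, 1.0 - y * score(w, x));
    }

    // A subgradient of Q_i at w: -y * [x, 1] when the margin is below 1,
    // otherwise the zero vector.
    static double[] subgradient(double[] w, int[] x, int y) {
        double[] g = new double[w.length];
        if (y * score(w, x) < 1.0) {
            for (int j = 0; j < x.length; j++) {
                g[j] = -y * x[j];
            }
            g[w.length - 1] = -y; // bias component
        }
        return g;
    }
}
```

With all weights at zero, Document 1 = [1, 1, 1, 0] and label +1 give a loss of 1 and a subgradient of [-1, -1, -1, 0, -1], so the word counts enter the update directly.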
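It is exactly in this expression that your data is used. I don't see a problem with your representation: after initializing w, you adjust it step by step until it yields a decent decision function.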
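
A minimal sketch of how the Wikipedia pseudocode might look for your count vectors with the hinge loss is shown below. It is not a drop-in replacement for your `perceptron` method: `HingeSgd`, `train`, and `predict` are hypothetical names, labels are assumed to be -1/+1, the data is laid out as x[instance][feature] (the transpose of your `a` array), there is no regularization, and a fixed number of epochs stands in for "until an approximate minimum is obtained".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class HingeSgd {
    // Trains a weight vector (last slot = bias) by SGD on the hinge loss.
    static double[] train(int[][] x, int[] y, double alpha, int epochs, long seed) {
        int d = x[0].length;
        double[] w = new double[d + 1]; // initial parameters: all zeros
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            order.add(i);
        }
        Random rng = new Random(seed);
        for (int epoch = 0; epoch < epochs; epoch++) {
            Collections.shuffle(order, rng); // randomly shuffle the examples
            for (int i : order) {
                double score = w[d];
                for (int j = 0; j < d; j++) {
                    score += w[j] * x[i][j];
                }
                // Subgradient of Q_i is -y_i * [x_i, 1] if the margin is below 1,
                // else zero, so w := w - alpha * grad becomes an addition here.
                if (y[i] * score < 1.0) {
                    for (int j = 0; j < d; j++) {
                        w[j] += alpha * y[i] * x[i][j];
                    }
                    w[d] += alpha * y[i];
                }
            }
        }
        return w;
    }

    // Predicts +1 or -1 from the sign of w . x + b.
    static int predict(double[] w, int[] x) {
        double score = w[w.length - 1];
        for (int j = 0; j < x.length; j++) {
            score += w[j] * x[j];
        }
        return score >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        int[][] x = {{1, 1, 1, 0}, {1, 1, 0, 2}}; // Document 1, Document 2
        int[] y = {1, -1};                        // hypothetical labels
        double[] w = train(x, y, 0.1, 100, 42L);
        System.out.println(predict(w, x[0]) + " " + predict(w, x[1]));
    }
}
```

Compared with the perceptron loop, the main differences are that the examples are shuffled each epoch and that an update also fires when a prediction is correct but its margin is below 1. In practice a regularization term and a decaying learning rate are usually added as well.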