I want to use a Weka J48 tree with 5-fold cross-validation. This is my code so far:

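import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaJvMain {
    public static void main(String[] args) {
        try {
            // CSV2Arff is my own helper class that writes data.arff from a CSV file
            CSV2Arff converter = new CSV2Arff();
            converter.convert();

            DataSource source = new DataSource("data.arff");
            Instances train = source.getDataSet();

            // use the last attribute as the class attribute
            train.setClassIndex(train.numAttributes() - 1);

            // classifier: an unpruned J48 tree
            J48 j48 = new J48();
            j48.setUnpruned(true);

            // train on the full data set and print the tree
            j48.buildClassifier(train);
            System.out.print(j48.graph());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}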


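This code trains on the data and prints the J48 tree, but I can't find how to set the number of folds for cross-validation. Please explain in detail; I'm not good at Java.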

Best answer

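Here is your code with a 5-fold cross-validation evaluation of the J48 classifier added. It is important to run the evaluation before training the final classifier. Additional information can be found here.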

import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaJvMain {
    public static void main(String[] args) {
        try {
            // CSV2Arff is the asker's own helper class that writes data.arff
            CSV2Arff converter = new CSV2Arff();
            converter.convert();

            DataSource source = new DataSource("data.arff");
            Instances train = source.getDataSet();

            // use the last attribute as the class attribute
            train.setClassIndex(train.numAttributes() - 1);

            // classifier: an unpruned J48 tree
            J48 j48 = new J48();
            j48.setUnpruned(true);

            // evaluate J48 with cross-validation
            Evaluation eval = new Evaluation(train);

            // arguments, in order: the classifier, the data,
            // the number of folds, and a seeded random number generator
            eval.crossValidateModel(j48, train, 5, new Random(1));
            System.out.println("Percent correct: " + eval.pctCorrect());

            // train the final classifier on all of the data and print the tree
            j48.buildClassifier(train);
            System.out.print(j48.graph());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
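
Since the question asks for a detailed explanation, here is a rough, Weka-free sketch of what the third argument (5) in eval.crossValidateModel(j48, train, 5, new Random(1)) means: the rows are shuffled, dealt into 5 groups (folds), and each fold serves as the test set once while the other 4 are used for training. The class name FoldSketch, the method makeFolds, and the row count of 12 are made up for this illustration and are not part of Weka. The sketch also leaves out stratification, which Weka applies when the class attribute is nominal, so treat it as an approximation of the idea rather than Weka's actual procedure.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustration only: FoldSketch and makeFolds are hypothetical names, not Weka API.
public class FoldSketch {

    // Shuffle the row indices 0..numRows-1 and deal them into numFolds groups.
    public static List<List<Integer>> makeFolds(int numRows, int numFolds, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numRows; i++) {
            indices.add(i);
        }
        // A fixed seed makes the shuffle, and therefore the folds, repeatable.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }
        // Deal the shuffled rows out round-robin, like dealing cards.
        for (int i = 0; i < indices.size(); i++) {
            folds.get(i % numFolds).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        int numRows = 12;   // assumed size of a toy data set
        int numFolds = 5;   // the "5" in 5-fold cross-validation
        List<List<Integer>> folds = makeFolds(numRows, numFolds, 1L);

        // Each round holds out one fold for testing and trains on the rest.
        for (int f = 0; f < numFolds; f++) {
            int testSize = folds.get(f).size();
            System.out.println("Round " + (f + 1) + ": test on rows " + folds.get(f)
                    + ", train on the other " + (numRows - testSize) + " rows");
        }
    }
}
```

To use a different number of folds in the Weka code, change the 5 to another value, for example 10. The accuracy printed by eval.pctCorrect() is computed over the predictions from all rounds; eval.toSummaryString() and eval.toMatrixString() print further statistics and the confusion matrix.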
