I want to evaluate a Weka J48 tree with 5-fold cross-validation. This is my code so far:
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaJvMain {
    public static void main(String[] args) {
        try {
            // CSV2Arff is my own helper class (not shown); it writes data.arff
            CSV2Arff converter = new CSV2Arff();
            converter.convert();

            DataSource source = new DataSource("data.arff");
            Instances train = source.getDataSet();
            train.setClassIndex(train.numAttributes() - 1); // last attribute is the class

            // classifier
            J48 j48 = new J48();
            j48.setUnpruned(true); // use an unpruned J48 tree
            j48.buildClassifier(train);
            System.out.print(j48.graph());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
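This code trains on the data and prints the J48 tree, but I can't find how to set the number of folds for cross-validation. Please explain in detail; I'm not good at Java.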
Best answer
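Here is your code with a 5-fold cross-validation evaluation of the J48 classifier added. Run the evaluation before training the final classifier: crossValidateModel trains its own copies of the classifier on each fold, so j48 still has to be built on the full data set afterwards before its tree can be printed. Additional information can be found here.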
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaJvMain {
    public static void main(String[] args) {
        try {
            // CSV2Arff is the asker's own helper class (not shown)
            CSV2Arff converter = new CSV2Arff();
            converter.convert();

            DataSource source = new DataSource("data.arff");
            Instances train = source.getDataSet();
            train.setClassIndex(train.numAttributes() - 1); // last attribute is the class

            // classifier
            J48 j48 = new J48();
            j48.setUnpruned(true); // use an unpruned J48 tree

            // evaluate j48 with cross-validation
            Evaluation eval = new Evaluation(train);
            // arguments, in order: the classifier, the data,
            // the number of folds, and a seeded random number generator
            eval.crossValidateModel(j48, train, 5, new Random(1));
            System.out.println("Percent correct: " + eval.pctCorrect());

            // train the final model on all of the data and print its tree
            j48.buildClassifier(train);
            System.out.print(j48.graph());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
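Since you asked for detail on what the "5" actually does, here is a rough plain-Java sketch of the idea behind k-fold splitting. It is not Weka's implementation (Weka also stratifies the folds when the class is nominal), and the class name KFoldSketch, the method makeFolds, and the row count of 23 are made up for illustration. The rows are shuffled with a seed, dealt into k groups, and each group takes one turn as the test set while the other groups are used for training.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class KFoldSketch {

    /** Shuffles the row indices 0..n-1 with the given seed and deals them into k folds. */
    public static List<List<Integer>> makeFolds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // Same seed, same shuffle: this is why a seeded Random makes results repeatable.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin so fold sizes differ by at most one.
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        int n = 23; // hypothetical number of rows in the data set
        int k = 5;  // number of folds
        List<List<Integer>> folds = makeFolds(n, k, 1L);
        for (int f = 0; f < k; f++) {
            int testSize = folds.get(f).size();
            // In round f, fold f is held out for testing; all other rows are used for training.
            System.out.println("Round " + (f + 1) + ": train on " + (n - testSize)
                    + " rows, test on " + testSize + " rows " + folds.get(f));
        }
    }
}
```

Every row is tested exactly once, by a model that never saw it during training, which is why the cross-validated accuracy is a fairer estimate than testing on the training data itself.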