Problem description
I have constructed a decision tree using rpart for a dataset.
I split the data into two parts, a training dataset and a test dataset, and built the tree from the training data. I want to calculate the accuracy of the predictions made by the resulting model.
My code is shown below:
library(rpart)
#reading the data
data = read.table("source")
names(data) <- c("a", "b", "c", "d", "class")
#generating test and train data - Data selected randomly with a 80/20 split
trainIndex <- sample(1:nrow(data), 0.8 * nrow(data))
train <- data[trainIndex,]
test <- data[-trainIndex,]
#tree construction based on information gain
tree = rpart(class ~ a + b + c + d, data = train, method = 'class', parms = list(split = "information"))
I now want to calculate the accuracy of the model's predictions by comparing them with the actual values in the train and test data; however, I am facing an error while doing so.
My code is shown below:
t_pred = predict(tree,test,type="class")
t = test['class']
accuracy = sum(t_pred == t)/length(t)
print(accuracy)
I get an error message that states -
On checking the type of t_pred, I found that it is of type integer; however, the documentation ( https://stat.ethz.ch/R-manual/R-devel/library/rpart/html/predict.rpart.html ) states that the predict() method must return a vector.
I am unable to understand why the type of the variable is integer and not a list. Where have I made the mistake, and how can I fix it?
Recommended answer
Try calculating the confusion matrix first:
confMat <- table(test$class,t_pred)
Now you can calculate the accuracy by dividing the sum of the matrix diagonal (the correct predictions) by the total sum of the matrix:
accuracy <- sum(diag(confMat))/sum(confMat)
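For anyone who wants to try this end to end, here is a minimal sketch of the whole flow. It assumes the built-in iris dataset as a stand-in for the asker's file (its columns are renamed to a, b, c, d, class), and the fixed seed and the names actual, accuracy_table and accuracy_direct are my own additions. It also shows two things that likely explain the confusion in the question: predict(..., type = "class") returns a factor, which R stores as integer codes, and test['class'] is a one-column data frame whose length() is the number of columns rather than the number of rows.

```r
library(rpart)

set.seed(42)  # assumption: fixed seed so the random split is reproducible

# Assumption: iris stands in for the asker's data; Species plays the role of `class`.
data <- iris
names(data) <- c("a", "b", "c", "d", "class")

# 80/20 train/test split
trainIndex <- sample(seq_len(nrow(data)), floor(0.8 * nrow(data)))
train <- data[trainIndex, ]
test  <- data[-trainIndex, ]

# Classification tree using information gain as the splitting criterion
tree <- rpart(class ~ a + b + c + d, data = train,
              method = "class", parms = list(split = "information"))

t_pred <- predict(tree, test, type = "class")

# The prediction is a factor; factors are stored as integer codes,
# which is why typeof() reports "integer".
print(class(t_pred))   # "factor"
print(typeof(t_pred))  # "integer"

# test["class"] is a one-column data frame, so length() counts columns, not rows.
print(length(test["class"]))  # 1

# test$class extracts the column itself as a factor vector.
actual <- test$class

# Accuracy from the confusion matrix (rows: actual, columns: predicted)
confMat <- table(actual, t_pred)
accuracy_table <- sum(diag(confMat)) / sum(confMat)

# Equivalent direct computation: the share of matching predictions
accuracy_direct <- mean(t_pred == actual)

print(confMat)
print(accuracy_table)
print(accuracy_direct)
```

Both accuracy values should agree. The diagonal approach relies on the table being square with rows and columns in the same level order, which holds here because the predicted and actual factors share the same levels.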