Problem description
I want to calculate two confusion matrices for my logistic regression, one using my training data and one using my testing data:
logitMod <- glm(LoanStatus_B ~ ., data=train, family=binomial(link="logit"))
I set the threshold for the predicted probability at 0.5:
confusionMatrix(table(predict(logitMod, type="response") >= 0.5,
train$LoanStatus_B == 1))
The code above works well for my training set. However, when I use the test set:
confusionMatrix(table(predict(logitMod, type="response") >= 0.5,
test$LoanStatus_B == 1))
it gives me the following error:
Error in table(predict(logitMod, type = "response") >= 0.5, test$LoanStatus_B == : all arguments must have the same length
Why is this? How can I fix this? Thank you!
Recommended answer
I think the problem is in the call to predict: you forgot to provide the new data. Without a newdata argument, predict returns the fitted values for the training set, so the vector of predictions has as many elements as train has rows, not test, and table complains that its arguments differ in length. Also, you can use the confusionMatrix function from the caret package to compute and display confusion matrices, and you don't need to table your results before that call.
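To make the length mismatch concrete, here is a minimal sketch in base R. The data is made up for illustration: the helper make_data, the seed, and the row counts (100 training rows, 40 test rows) are assumptions, not part of the original question.

```r
set.seed(42)  # arbitrary seed so the toy data is reproducible

# Hypothetical helper that builds toy data with a binary target
make_data <- function(n) {
  data.frame(LoanStatus_B = as.numeric(rnorm(n) > 0.5),
             b = rnorm(n), c = rnorm(n), d = rnorm(n))
}
train <- make_data(100)
test  <- make_data(40)

logitMod <- glm(LoanStatus_B ~ ., data = train, family = binomial(link = "logit"))

# Without newdata, predict() returns fitted values for the training rows
p_default <- predict(logitMod, type = "response")
length(p_default) == nrow(train)  # TRUE

# With newdata = test, there is one prediction per test row
p_test <- predict(logitMod, newdata = test, type = "response")
length(p_test) == nrow(test)      # TRUE

# Both arguments to table() now have the same length
table(Prediction = p_test >= 0.5, Reference = test$LoanStatus_B == 1)
```

In the question, the first call to table worked only because the default predictions happened to line up with train; passing newdata = test is what makes the second call line up with test.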
Here, I create a toy dataset with a representative binary target variable and then train a model similar to yours.
train <- data.frame(LoanStatus_B = as.numeric(rnorm(100)>0.5), b= rnorm(100), c = rnorm(100), d = rnorm(100))
logitMod <- glm(LoanStatus_B ~ ., data=train, family=binomial(link="logit"))
Now you can predict on the data (for example, your training set) and then call confusionMatrix(), which takes two arguments:
- your predictions
- the observed classes
library(caret)
# Use your model to make predictions, in this example newdata = training set, but replace with your test set
pdata <- predict(logitMod, newdata = train, type = "response")
# use caret and compute a confusion matrix
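# Both arguments are converted to factors with the same levels,
# since recent versions of caret reject plain numeric vectors here
confusionMatrix(data = factor(as.numeric(pdata > 0.5), levels = c(0, 1)),
                reference = factor(train$LoanStatus_B, levels = c(0, 1)))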
Here are the results:
Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 66 33
         1  0  1

               Accuracy : 0.67
                 95% CI : (0.5688, 0.7608)
    No Information Rate : 0.66
    P-Value [Acc > NIR] : 0.4625
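As a quick check on how to read that output, the accuracy and the no information rate can be recomputed by hand from the four counts. This is a small sketch in base R; the variable names cm, accuracy and nir are chosen here for illustration.

```r
# Counts copied from the output above: rows are predictions, columns are reference classes
cm <- matrix(c(66, 0, 33, 1), nrow = 2,
             dimnames = list(Prediction = c("0", "1"), Reference = c("0", "1")))

# Accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(cm)) / sum(cm)     # (66 + 1) / 100

# No information rate: share of the most frequent reference class
nir <- max(colSums(cm)) / sum(cm)       # 66 / 100

accuracy  # 0.67
nir       # 0.66
```

On this toy data the model barely beats always predicting class 0, which is what the large P-Value [Acc > NIR] reflects.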