This post covers how to compute the AUC on the training dataset for a model fitted with glm in R.

Problem description

I am trying to find the AUC on the training data for a logistic regression model fitted with glm.

I split the data into training and test sets, fitted a logistic regression model using glm, computed the predicted values, and am now trying to find the AUC.

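library(pROC)  # provides roc() and auc()

d <- read.csv(file.choose(), header = TRUE)
set.seed(12345)
train <- runif(nrow(d)) < 0.5   # logical flag: TRUE marks a training row
table(train)

## fit on the training rows only (the original call used all of d)
fit <- glm(y ~ ., family = binomial, data = d[train, ])

## predicted probabilities for the training rows
phat <- predict(fit, newdata = d[train, ], type = "response")

## ROC curve and AUC on the training data
g <- roc(d$y[train], phat)
auc(g)
plot(g, print.auc = TRUE)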
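
As a cross-check that does not depend on any ROC package, the AUC can also be computed from ranks, since it equals the normalized Mann-Whitney rank-sum statistic. The following is a minimal sketch under stated assumptions: the data are simulated (they are not the CSV from the question), and `auc_rank` is a hypothetical helper name introduced here for illustration.

```r
## Simulated data (an assumption for illustration; not the asker's CSV)
set.seed(12345)
n <- 400
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * d$x1 - 0.8 * d$x2))

## Random split into training and test rows
train <- runif(n) < 0.5

## Fit on the training rows only
fit <- glm(y ~ ., family = binomial, data = d[train, ])

## Hypothetical helper: AUC via the Mann-Whitney rank-sum identity
auc_rank <- function(labels, scores) {
  pos   <- labels == 1
  n_pos <- as.numeric(sum(pos))
  n_neg <- as.numeric(sum(!pos))
  r     <- rank(scores)   # average ranks, so ties count as one half
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

## Predicted probabilities for each subset
phat_train <- predict(fit, newdata = d[train, ],  type = "response")
phat_test  <- predict(fit, newdata = d[!train, ], type = "response")

auc_train <- auc_rank(d$y[train],  phat_train)
auc_test  <- auc_rank(d$y[!train], phat_test)
print(c(train = auc_train, test = auc_test))
```

The training AUC is typically somewhat optimistic because the model is scored on the same rows it was fitted to, which is why the test-set value is printed next to it.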

Recommended answer

Another user-friendly option is the caret library, which makes it fairly straightforward to fit and compare regression and classification models in R. The following example code uses the GermanCredit dataset to predict creditworthiness with a logistic regression model. The code is adapted from this blog post: https://www.r-bloggers.com/evaluating-logistic-regression-models/.

library(caret)

## example from https://www.r-bloggers.com/evaluating-logistic-regression-models/
data(GermanCredit)

## 60% training / 40% test data
trainIndex <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)

GermanCreditTrain <- GermanCredit[trainIndex, ]
GermanCreditTest <- GermanCredit[-trainIndex, ]

## logistic regression evaluated with 10-fold cross-validation
## (the control object is named ctrl so it does not shadow caret's trainControl())
ctrl <- trainControl(
    method = "cv",
    number = 10,
    classProbs = TRUE,                  ## needed to compute ROC-based metrics
    summaryFunction = twoClassSummary   ## reports ROC (AUC), Sens, Spec
)

fit <- train(
    form = Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own +
        CreditHistory.Critical,
    data = GermanCreditTrain,
    trControl = ctrl,
    method = "glm",
    family = "binomial",
    metric = "ROC"
)

## AUC ROC for training data (cross-validated estimate over the 10 folds)
print(fit)

## AUC ROC for test data
## See https://topepo.github.io/caret/measuring-performance.html#measures-for-class-probabilities
predictTest <- data.frame(
    obs = GermanCreditTest$Class,                                    ## observed class labels
    predict(fit, newdata = GermanCreditTest, type = "prob"),         ## predicted class probabilities
    pred = predict(fit, newdata = GermanCreditTest, type = "raw")    ## predicted class labels
)

twoClassSummary(data = predictTest, lev = levels(predictTest$obs))
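
Note that the ROC value shown by `print(fit)` is a cross-validated estimate, not the AUC obtained by scoring the training rows with the final model. If the latter is what is wanted, the same `twoClassSummary` approach can be applied to the training set. The sketch below is self-contained and mirrors the example above; the seed and the names `trainSet`, `predictTrain` and `trainPerf` are assumptions introduced for illustration, and `twoClassSummary` needs the pROC package to be installed.

```r
library(caret)   # twoClassSummary() also relies on the pROC package

data(GermanCredit)
set.seed(12345)  # assumed seed, only to make the split reproducible
idx <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)
trainSet <- GermanCredit[idx, ]

## same model and resampling setup as in the example above
fit <- train(
    form = Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own +
        CreditHistory.Critical,
    data = trainSet,
    trControl = trainControl(
        method = "cv",
        number = 10,
        classProbs = TRUE,
        summaryFunction = twoClassSummary
    ),
    method = "glm",
    family = "binomial",
    metric = "ROC"
)

## score the training rows themselves with the final model
predictTrain <- data.frame(
    obs = trainSet$Class,                                   ## observed class labels
    predict(fit, newdata = trainSet, type = "prob"),        ## predicted class probabilities
    pred = predict(fit, newdata = trainSet, type = "raw")   ## predicted class labels
)

trainPerf <- twoClassSummary(data = predictTrain, lev = levels(predictTrain$obs))
print(trainPerf)           ## ROC here is the AUC on the training rows
print(fit$results$ROC)     ## cross-validated AUC, for comparison
```

The two numbers usually differ a little: the value from `twoClassSummary` on the training rows tends to be the more optimistic one, since every row was seen during fitting.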

