Tuning parameters for "glm" versus "rf" in caret's train

Question

I am trying to build a classification model using method = "glm" in train. With method = "rpart" it works fine, but when I switch to method = "glm" it gives me an error.

I tried using

cpGrid = data.frame(.0001)

cpGrid = data.frame(expand.grid(.cp = seq(.0001, .09, .001)))

But both throw an error.
Below is my initial code:

numFolds = trainControl(method = "cv", number = 10, repeats = 3)
cpGrid = expand.grid(.cp = seq(.0001, .09, .001))

Works fine:

temp <-train(Churn. ~., data = train, method = 'rpart', trControl = numFolds, tuneGrid = cpGrid)

Gives an error:

treeCV <-train(Churn. ~., data = train, method = 'glm', trControl = numFolds, tuneGrid = data.frame(cpGrid))
predictCV = predict(treeCV, newdata = test, type = "prob")

dput of my data:

train <- structure(list(State = structure(c(17L, 32L, 36L, 37L, 20L, 25L
), .Label = c("AK", "AL", "AR", "AZ", "CA", "CO", "CT", "DC",
"DE", "FL", "GA", "HI", "IA", "ID", "IL", "IN", "KS", "KY", "LA",
"MA", "MD", "ME", "MI", "MN", "MO", "MS", "MT", "NC", "ND", "NE",
"NH", "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA", "RI", "SC",
"SD", "TN", "TX", "UT", "VA", "VT", "WA", "WI", "WV", "WY"), class = "factor"),
    VMail.Message = c(25L, 0L, 0L, 0L, 24L, 0L), Day.Mins = c(265.1,
    243.4, 299.4, 166.7, 218.2, 157), Day.Calls = c(110L, 114L,
    71L, 113L, 88L, 79L), Eve.Charge = c(16.78, 10.3, 5.26, 12.61,
    29.62, 8.76), Night.Mins = c(244.7, 162.6, 196.9, 186.9,
    212.6, 211.8), Night.Calls = c(91L, 104L, 89L, 121L, 118L,
    96L), Intl.Mins = c(10, 12.2, 6.6, 10.1, 7.5, 7.1), CustServ.Calls = c(1L,
    0L, 2L, 3L, 3L, 0L), Churn. = structure(c(1L, 1L, 1L, 1L,
    1L, 1L), .Label = c("False.", "True."), class = "factor"),
    Area.Code = c(2, 2, 1, 2, 3, 2), Int.l.Plan = c(1, 1, 2,
    2, 1, 2), VMail.Plan = c(2, 1, 1, 1, 2, 1), Day.Charge = c(565,
    1005, 1571, 665, 1113, 580), Eve.Mins = c(690, 87, 1535,
    256, 1517, 9), Eve.Calls = c(120, 12, 109, 25, 10, 115),
    Night.Charge = c(101, 644, 797, 753, 866, 862), Intl.Calls = c(15,
    17, 19, 15, 19, 15), Intl.Charge = c(78, 100, 44, 79, 53,
    49)), .Names = c("State", "VMail.Message", "Day.Mins", "Day.Calls",
"Eve.Charge", "Night.Mins", "Night.Calls", "Intl.Mins", "CustServ.Calls",
"Churn.", "Area.Code", "Int.l.Plan", "VMail.Plan", "Day.Charge",
"Eve.Mins", "Eve.Calls", "Night.Charge", "Intl.Calls", "Intl.Charge"
), row.names = c(1L, 3L, 4L, 5L, 7L, 8L), class = "data.frame")

I need help using cpGrid with method = "glm". I would also like to know how to include ntree. I have explored some of the solutions offered elsewhere, but nothing seems to work.

Recommended answer

The modelLookup command in caret lists the tuning parameters available for a model.
For rpart only one tuning parameter is available: cp, the complexity parameter.

modelLookup("rpart")

#################
  model parameter                label forReg forClass probModel
1 rpart        cp Complexity Parameter   TRUE     TRUE      TRUE
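The lookup above can also be used to check a grid before calling train. The following is a minimal sketch, assuming caret is installed; grid_matches_model is a hypothetical helper name introduced here for illustration, not part of caret.

```r
library(caret)

# Hypothetical helper (not part of caret): TRUE when the grid's column names,
# ignoring an optional leading dot, match the tuning parameters that
# modelLookup() lists for the given method.
grid_matches_model <- function(method, grid) {
  expected <- as.character(modelLookup(method)$parameter)
  supplied <- sub("^\\.", "", names(grid))
  setequal(expected, supplied)
}

cpGrid <- expand.grid(.cp = seq(.0001, .09, .001))

rpart_ok <- grid_matches_model("rpart", cpGrid)  # cp is rpart's tuning parameter
glm_ok   <- grid_matches_model("glm", cpGrid)    # glm does not list cp
```

Here rpart_ok is TRUE and glm_ok is FALSE, which is consistent with the cp grid working for rpart and failing for glm.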

The tuning parameter listed for glm is parameter (I don't know what it is for):

modelLookup("glm")

#################
  model parameter     label forReg forClass probModel
1   glm parameter parameter   TRUE     TRUE      TRUE
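As far as I can tell, parameter is a placeholder that caret reports for models with nothing to tune. The same lookup also addresses the ntree part of the question: for "rf" the only listed tuning parameter is mtry, so ntree is not something a tuneGrid can vary. A small sketch, assuming caret is installed:

```r
library(caret)

# Collect the tuning-parameter names caret reports for each method.
methods <- c("rpart", "glm", "rf")
params <- sapply(methods, function(m) as.character(modelLookup(m)$parameter))

# params is a named character vector: rpart -> "cp", glm -> "parameter",
# rf -> "mtry". ntree does not appear, so it cannot go in tuneGrid.
print(params)
```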

Hence, tuneGrid for glm needs a column named .parameter:

glmGrid = expand.grid(.parameter = seq(1, 10, 1))
glmCV <- train(Churn. ~., data = train, method = 'glm', trControl = numFolds,
      tuneGrid = data.frame(glmGrid))

predictCV = predict(glmCV, newdata = test, type = "prob")
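Since glm has nothing to tune, a simpler route may be to leave tuneGrid out entirely. For ntree, one option is to switch to method = "rf" and pass ntree through train's ... argument, which forwards it to randomForest, while tuning mtry in the grid. The sketch below assumes the caret and randomForest packages are installed and uses a made-up data frame (toy, x1, x2, y are illustrative names, not the churn data). It also uses method = "repeatedcv", because repeats only takes effect with that resampling method.

```r
library(caret)

set.seed(1)
n <- 200
# Synthetic two-class data, for illustration only.
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- factor(ifelse(toy$x1 + rnorm(n) > 0, "True.", "False."))

# repeats is only used when method = "repeatedcv".
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)

# glm: no tuneGrid, caret fits the single available model.
glm_fit <- train(y ~ ., data = toy, method = "glm", trControl = ctrl)

# rf: mtry is tuned via tuneGrid; ntree is passed on to randomForest().
rf_fit <- train(y ~ ., data = toy, method = "rf", trControl = ctrl,
                tuneGrid = expand.grid(mtry = 1:2), ntree = 100)

# Class probabilities from the glm fit, one column per class level.
probs <- predict(glm_fit, newdata = toy, type = "prob")
```

Trying several ntree values would then mean calling train once per value and comparing the resampled results, since ntree is not part of the grid.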
