r - Setting up bagging in R adabag

I can't seem to get adabag's bagging and predict.bagging to work.

The predict.bagging manual page contains the following:

 library(adabag)
 library(rpart)
 data(iris)
 names(iris)<-c("LS","AS","LP","AP","Especies")
 sub <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))
 iris.bagging <- bagging(Especies ~ ., data=iris[sub,], mfinal=10)
 iris.predbagging<- predict.bagging(iris.bagging, newdata=iris[-sub,])
 iris.predbagging

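That is fine, and it works. However, as soon as I change the newdata passed to predict.bagging even slightly, it stops working.

Mainly, I can't actually delete or change the Especies column, which is odd, because that is the column I am supposed to predict! An example: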

testdata <- iris[-sub, ]
result <- predict.bagging(iris.bagging, newdata=testdata)

...this works, and is almost a copy of the example. However, this produces an error:

testdata <- iris[-sub, -5] #this deletes the Especies column!
result <- predict.bagging(iris.bagging, newdata=testdata)

And this:

testdata <- iris[-sub, ]
testdata$Especies <- c("virginica") #sets up everything as virginica
result <- predict.bagging(iris.bagging, newdata=testdata)

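also produces an error!

What is going on? I want to use bagging to build a classifier, but I can't know the results in advance; having to supply them defeats the purpose.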

Edit: OK, it gets even weirder.

> testdata <- iris[150,]
> predict.bagging(iris.bagging, newdata=testdata) #all working
> testdata
     LS AS  LP  AP  Especies
150 5.9  3 5.1 1.8 virginica
> is(testdata)
[1] "data.frame" "list"       "oldClass"   "vector"
> testdata$Especies = "virginica"
> testdata
     LS AS  LP  AP  Especies
150 5.9  3 5.1 1.8 virginica    #!!!the same thing!!!
> is(testdata)
[1] "data.frame" "list"       "oldClass"   "vector"    #the same object type!!!
>
> predict.bagging(iris.bagging, newdata = testdata)
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr,  :
  length of 'dimnames' [2] not equal to array extent
In addition: Warning messages:
1: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
3: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
4: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
5: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
6: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
7: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
8: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
9: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
10: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL'
>

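Best answer

Oh, I get it now, at least partly.

Apparently the last column, Especies, is a factor, not a character vector. So, to change it, I have to rebuild it as a factor, like this: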
testdata$Especies <- factor(c("virginica"), levels=c("setosa", "versicolor", "virginica"))
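The printed data frame and is(testdata) look identical in both cases because they describe the data frame as a whole, not its columns. A minimal sketch, using only base R, of how the column type could be inspected before and after the assignment (the variable names before and after are just for illustration):

```r
data(iris)
names(iris) <- c("LS", "AS", "LP", "AP", "Especies")

testdata <- iris[150, ]
before <- class(testdata$Especies)   # column type as shipped with iris

testdata$Especies <- "virginica"     # plain string assignment, as in the question
after <- class(testdata$Especies)    # column type after the assignment

# The two rows print the same, but the column types differ
print(c(before = before, after = after))
str(testdata)
```

str() or class() on the column shows the change that printing the row hides.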
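If I have a data frame without the last column, I have to add it anyway, and the factor's levels must be exactly the same as those of the factor in the original table; the actual contents don't matter.

For now I am not accepting my own answer as the best one, because someone may be able to explain the reason better.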
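As a rough sketch of that workaround for new data that arrives without the response column: the placeholder value "virginica" is arbitrary, set.seed(1) is added here only for reproducibility, and the model call is guarded because it assumes adabag is installed. It uses the generic predict(), which should dispatch to the bagging method; behavior may differ between adabag versions.

```r
data(iris)
names(iris) <- c("LS", "AS", "LP", "AP", "Especies")
set.seed(1)  # assumption: only here to make the split reproducible
sub <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))

# New data with the response column removed, as in the failing example
testdata <- iris[-sub, -5]

# Add a placeholder response as a factor with the training levels
testdata$Especies <- factor(rep("virginica", nrow(testdata)),
                            levels = levels(iris$Especies))

# Fit and predict only if adabag is available
if (requireNamespace("adabag", quietly = TRUE)) {
  library(adabag)
  iris.bagging <- bagging(Especies ~ ., data = iris[sub, ], mfinal = 10)
  pred <- predict(iris.bagging, newdata = testdata)
  print(head(pred$class))  # predicted classes for the new rows
}
```

Because the response is a placeholder, the confusion matrix and error in the returned object compare predictions against made-up labels and should be ignored; only the predicted classes (and the votes or probabilities) are meaningful here.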

Regarding "r - Setting up bagging in R adabag", the original question can be found on Stack Overflow: https://stackoverflow.com/questions/8983060/
