Error: nrow(x) == n is not TRUE when using train in caret

Problem description

I have a training set that looks like this:

Name     Day       Area     X        Y      Month  Night
ATTACK   Monday    LA       -122.41  37.78  8      0
VEHICLE  Saturday  CHICAGO  -1.67    3.15   2      0
MOUSE    Monday    TAIPEI   -12.5    3.1    9      1

Name is the outcome (dependent variable). I converted Name, Area, and Day into factors, but I wasn't sure whether I should do the same for Month and Night, which only take integer values 1-12 and 0-1, respectively.
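The answer below does not say whether Month and Night should be factors, so the following is only a sketch of what the choice changes in the design matrix. It uses base R and a hypothetical three-row data frame `df` built from the sample values above. Which coding suits the model is a modelling decision this thread does not settle.

```r
# Hypothetical toy data taken from the three sample rows in the question.
df <- data.frame(Month = c(8, 2, 9), Night = c(0, 0, 1))

# Numeric coding: each variable contributes a single column.
num_mm <- model.matrix(~ Month + Night, data = df)
colnames(num_mm)

# Factor coding: Month contributes one dummy column per non-reference level.
df$MonthF <- factor(df$Month)
fac_mm <- model.matrix(~ MonthF + Night, data = df)
colnames(fac_mm)
```

With all twelve months present, the factor version would add eleven dummy columns instead of one numeric column. For a 0/1 variable such as Night, a numeric column and a two-level factor carry the same information.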

I then convert the data into matrices:

ynn <- model.matrix(~Name , data = trainDF)
mnn <- model.matrix(~ Day+Area +X + Y + Month + Night, data = trainDF)

Then I set up the tuning parameters:

nnTrControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5,
                            verboseIter = TRUE, returnData = FALSE, returnResamp = "all",
                            classProbs = TRUE, summaryFunction = multiClassSummary, allowParallel = TRUE)
nnGrid <- expand.grid(.size = c(1, 4, 7), .decay = c(0, 0.001, 0.1))
model <- train(y = ynn, x = mnn, method = "nnet", linout = TRUE, trace = FALSE,
               trControl = nnTrControl, metric = "logLoss", tuneGrid = nnGrid)

However, I get the error nrow(x) == n is not TRUE at the model <- train(...) call.

I also get a similar error if I use xgboost instead of nnet.

Does anyone know what's causing this?

Recommended answer

y should be a numeric or factor vector containing the outcome for each sample, not a matrix. Using

train(y = make.names(trainDF$Name), ...)

helps, where make.names modifies the values so that they are valid variable names.
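To see the mismatch concretely, here is a minimal base R sketch that rebuilds the question's two matrices from the three sample rows and compares their sizes. The toy `trainDF`, the name `y_vec`, and the `factor()` wrapper are assumptions added for illustration. Judging by the error message, train appears to compare nrow(x) with the length of y, but the sketch does not call caret to confirm that.

```r
# Toy stand-in for trainDF, built from the three sample rows in the question.
trainDF <- data.frame(
  Name  = factor(c("ATTACK", "VEHICLE", "MOUSE")),
  Day   = factor(c("Monday", "Saturday", "Monday")),
  Area  = factor(c("LA", "CHICAGO", "TAIPEI")),
  X     = c(-122.41, -1.67, -12.5),
  Y     = c(37.78, 3.15, 3.1),
  Month = c(8, 2, 9),
  Night = c(0, 0, 1)
)

# The question's approach: the outcome becomes a dummy-coded matrix.
ynn <- model.matrix(~ Name, data = trainDF)
mnn <- model.matrix(~ Day + Area + X + Y + Month + Night, data = trainDF)

dim(ynn)                    # rows x (intercept + one dummy per non-reference level)
length(ynn)                 # rows * columns, not the number of samples
nrow(mnn) == length(ynn)    # the sizes disagree

# The answer's approach: keep the outcome as a plain vector.
# Wrapping in factor() is an assumption here, to make the class labels explicit.
y_vec <- factor(make.names(trainDF$Name))
nrow(mnn) == length(y_vec)  # one outcome per row of x

# make.names matters when labels are not valid R names (hypothetical labels):
make.names(c("LARCENY/THEFT", "NON-CRIMINAL", "ATTACK"))
```

With the outcome passed as a vector, the call from the question would become train(y = y_vec, x = mnn, ...) with the remaining arguments unchanged. Valid level names matter here because classProbs = TRUE uses the class levels as column names for the predicted probabilities.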


07-25 12:23