How to coerce multiple columns of an H2OFrame object to factors?

Question

I am trying to follow the suggestion from the question "Coerce multiple columns to factors at once", but it does not work for an H2OFrame object. For example:

library(h2o)
h2o.init()  # start or connect to a local H2O cluster

data <- data.frame(matrix(sample(1:40), 4, 10, dimnames = list(1:4, LETTERS[1:10])))
data.hex <- as.h2o(data, destination_frame = "data.hex")
cols <- c("A", "C", "D", "H")
data.hex[cols] <- lapply(data.hex[cols], factor)  # this assignment raises the error

This produces the following error message:

Error in `[<-.H2OFrame`(`*tmp*`, cols, value = list(1L, 1L, 1L, 1L, 1L,  : 
  `value` can only be an H2OFrame object or a numeric or character vector
In addition: Warning message:
In if (is.na(value)) value <- NA_integer_ else if (!is.numeric(value) &&  :
  the condition has length > 1 and only the first element will be used

If I coerce the columns to factors one by one, it works. Another workaround is to coerce the data.frame columns to factors first and then convert the data.frame into an H2OFrame object, for example:

data[cols] <- lapply(data[cols], factor)
data.hex <- as.h2o(data, destination_frame = "data.hex")
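
A likely reading of the error message, offered as a sketch rather than a confirmed diagnosis: lapply() always returns a plain list, and the replacement method for data.frame accepts a list, whereas the message above says the H2OFrame method accepts only an H2OFrame or a numeric or character vector. The base R snippet below (no H2O needed; set.seed() is added only for reproducibility) shows the data.frame side of that difference:

```r
# Base R only; no H2O cluster is needed for this illustration.
set.seed(1)  # added for reproducibility, not part of the original example
data <- data.frame(matrix(sample(1:40), 4, 10, dimnames = list(1:4, LETTERS[1:10])))
cols <- c("A", "C", "D", "H")

# lapply() walks over the selected columns and returns a plain list,
# with one factor per column.
converted <- lapply(data[cols], factor)
print(class(converted))

# `[<-.data.frame` accepts a list as the replacement value, so this works.
data[cols] <- converted
print(sapply(data[cols], is.factor))
```

Because the columns are already factors in the data.frame, as.h2o() can then carry them over, which is why the workaround above succeeds.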

Is there an explanation for why this happens, or a better workaround?

Recommended answer

The right way to do it is to use the H2OFrame apply() function; however, this produces the same error that @MKR mentioned. I have created a JIRA ticket here.

In theory, this should work:

data.hex[,cols] <- apply(X = data.hex[,cols], MARGIN = 2, FUN = as.factor)

For now, the workaround is:

for (col in cols) {
  data.hex[col] <- as.factor(data.hex[col])
}
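
If the loop is needed in several places, it could be wrapped in a small helper. The sketch below is hypothetical: the name coerce_to_factor and the column-name check are not part of the answer, and it uses double-bracket column access, which is demonstrated here on a plain data.frame only. Whether `[[` and `[[<-` behave the same way on an H2OFrame is an assumption; the single-bracket loop above is the form the answer confirms.

```r
# Hypothetical helper (not part of h2o): coerce the named columns to factors.
# Assumption: `frame` supports colnames(), `[[` and `[[<-`, as a data.frame does.
coerce_to_factor <- function(frame, cols) {
  # Fail early if any requested column is absent.
  missing_cols <- setdiff(cols, colnames(frame))
  if (length(missing_cols) > 0) {
    stop("Unknown column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Same idea as the loop above: convert one column at a time.
  for (col in cols) {
    frame[[col]] <- as.factor(frame[[col]])
  }
  frame
}

# Usage on a plain data.frame.
df <- data.frame(A = c(1, 2, 2), B = c(3, 4, 5), C = c("x", "y", "x"),
                 stringsAsFactors = FALSE)
df <- coerce_to_factor(df, c("A", "C"))
print(sapply(df, is.factor))
```

Converting one column per iteration keeps every replacement value a single column, which avoids the list-valued assignment that triggered the error.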

