I have the following xts object.
x <- structure(c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5, 30440.5, 30441,
30441.5, NA, NA, 30439.5, NA, NA, NA, 30441.5, 30441, NA), .indexTZ = "",
class = c("xts", "zoo"), .indexCLASS = c("POSIXct", "POSIXt"),
tclass = c("POSIXct", "POSIXt"), tzone = "",
index = structure(c(1519866931.1185, 1519866931.1255, 1519866931.1255,
1519866931.1905, 1519866931.1905, 1519866931.1915),
tzone = "", tclass = c("POSIXct", "POSIXt")),
.indexFormat = "%Y-%m-%d %H:%M:%OS",
.Dim = c(6L, 3L), .Dimnames = list(NULL, c("x", "y", "z")))
# x y z
# 2018-03-01 09:15:31.118 30440.5 30440.5 NA
# 2018-03-01 09:15:31.125 30441.0 30441.0 NA
# 2018-03-01 09:15:31.125 30441.5 30441.5 NA
# 2018-03-01 09:15:31.190 30441.5 NA 30441.5
# 2018-03-01 09:15:31.190 30441.0 NA 30441.0
# 2018-03-01 09:15:31.191 30439.5 30439.5 NA
How do I write a vapply call that takes mean(..., na.rm = TRUE) across each row, so that it returns a single column like this?
                              w
2018-03-01 09:15:31.118 30440.5
2018-03-01 09:15:31.125 30441.0
2018-03-01 09:15:31.125 30441.5
2018-03-01 09:15:31.190 30441.5
2018-03-01 09:15:31.190 30441.0
2018-03-01 09:15:31.191 30439.5
I just can't get it to work.
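I've noticed that many answers advise against vapply and recommend other functions instead. However, according to this answer, vapply is actually the fastest. So which apply function is best here?

Best answer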
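If you want the mean across the columns for each row, I wouldn't use vapply. I would use rowMeans, and note that you have to convert the result back to xts.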
(xmean <- xts(rowMeans(x, na.rm = TRUE), index(x)))
# [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
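The question asks for a single column named w, while the result above has no column name. A minimal sketch of one way to get there, assuming the data from the question is rebuilt with the regular xts() constructor (the tz = "UTC" setting and the names idx, vals and w are illustrative assumptions, not part of the original answer):

```r
library(xts)

# Rebuild the question's data as a plain matrix plus a POSIXct index.
# tz = "UTC" is an assumption; the original object has an empty tzone.
idx <- as.POSIXct(c(1519866931.1185, 1519866931.1255, 1519866931.1255,
                    1519866931.1905, 1519866931.1905, 1519866931.1915),
                  origin = "1970-01-01", tz = "UTC")
vals <- matrix(c(30440.5, 30441, 30441.5, 30441.5, 30441, 30439.5,
                 30440.5, 30441, 30441.5, NA, NA, 30439.5,
                 NA, NA, NA, 30441.5, 30441, NA),
               ncol = 3, dimnames = list(NULL, c("x", "y", "z")))
x <- xts(vals, order.by = idx)

# Row means on the underlying matrix, wrapped back into xts on the same index
w <- xts(rowMeans(coredata(x), na.rm = TRUE), order.by = index(x))
colnames(w) <- "w"  # name the single column as requested in the question
```

Calling coredata() first is optional here; it just makes explicit that rowMeans works on the numeric matrix and that the index is reattached afterwards.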
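And I would use apply for generic functions that don't have a specialized implementation. Note that you need to transpose the result if the function returns more than one value.
(xmin <- as.xts(apply(x, 1, min, na.rm = TRUE), dateFormat = "POSIXct"))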
# [,1]
# 2018-02-28 19:15:31 30440.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.5
# 2018-02-28 19:15:31 30441.0
# 2018-02-28 19:15:31 30439.5
(xrange <- as.xts(t(apply(x, 1, range, na.rm = TRUE)), dateFormat = "POSIXct"))
# [,1] [,2]
# 2018-02-28 19:15:31 30440.5 30440.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.5 30441.5
# 2018-02-28 19:15:31 30441.0 30441.0
# 2018-02-28 19:15:31 30439.5 30439.5
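The need for t() comes from how apply arranges its output: when the function returns more than one value per row, each row's result becomes a column. A small sketch on a plain matrix (the names m and r are made up for illustration):

```r
# A 3-row, 2-column matrix
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)

# range() returns two values per row, so apply() stacks them as columns:
# the result has 2 rows (min, max) and 3 columns (one per input row)
r <- apply(m, 1, range)
dim(r)

# Transposing restores one row per input row, which is the shape
# needed before converting back to xts
dim(t(r))
```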
To address the comments asking "why not use vapply()", here are some benchmarks (using the data from the Code Review Q/A that the OP linked to):
set.seed(21)
xz <- xts(replicate(6, sample(c(1:100), 1000, replace = TRUE)),
order.by = Sys.Date() + 1:1000)
xrowmean <- function(x) { xts(rowMeans(x, na.rm = TRUE), index(x)) }
xapply <- function(x) { as.xts(apply(x, 1, mean, na.rm = TRUE), dateFormat = "POSIXct") }
xvapply <- function(x) { xts(vapply(seq_len(nrow(x)), function(i) {
mean(x[i,], na.rm = TRUE) }, FUN.VALUE = numeric(1)), index(x)) }
library(microbenchmark)
microbenchmark(xrowmean(xz), xapply(xz), xvapply(xz))
# Unit: microseconds
# expr min lq mean median uq max neval
# xrowmean(xz) 169.496 188.8505 207.1931 204.2455 219.4945 285.329 100
# xapply(xz) 33477.542 34203.3260 35698.0503 35076.4655 36821.1320 43910.353 100
# xvapply(xz) 32709.238 35010.1920 37514.7557 35884.3585 37972.7085 84409.961 100
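So, why not use vapply()? It offers no performance benefit here. It is more verbose than the apply() version, and it is not clear that the safety of a pre-specified return value buys you much when you control both the type of the object and the function being called. That said, using vapply() won't hurt you. I just prefer apply() for this case.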
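For context on the "pre-specified return value" point, here is a rough sketch of what that safety check does, using a small plain matrix rather than the xts object (the names m, row_means and res are hypothetical):

```r
# Rows are (1, NA, 5) and (2, 4, 6)
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 2)

# mean() returns one number per row, which matches FUN.VALUE = numeric(1)
row_means <- vapply(seq_len(nrow(m)),
                    function(i) mean(m[i, ], na.rm = TRUE),
                    FUN.VALUE = numeric(1))

# range() returns two numbers per row, so it does not match numeric(1);
# vapply() stops with an error, which is caught here for demonstration
res <- tryCatch(
  vapply(seq_len(nrow(m)),
         function(i) range(m[i, ], na.rm = TRUE),
         FUN.VALUE = numeric(1)),
  error = function(e) "error"
)
```

With apply(), the second call would instead silently return a 2-row matrix, which is the kind of surprise vapply() guards against. Whether that guard matters depends on how much control you have over the function being applied.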