Problem description
The printed output of a monthly time series looks like a data frame:
ts(rnorm(12*5, 17, 8), start=c(1981,1), frequency = 12)
Jan Feb Mar Apr May Jun Jul ...
1981 14.064085 21.664250 14.800249 -5.773095 16.477470 1.129674 16.747669 ...
1982 23.973620 17.851890 21.387944 28.451552 24.177141 25.212271 19.123179 ...
1983 19.801210 11.523906 8.103132 9.382778 4.614325 21.751529 9.540851 ...
1984 15.394517 21.021790 23.115453 12.685093 -2.209352 28.318686 10.159940 ...
1985 20.708447 13.095117 32.815273 9.393895 19.551045 24.847337 18.703991 ...
It would be handy to transform it into a data frame with columns Jan, Feb, Mar, ... and rows 1981, 1982, ..., and then back again. What's the most elegant way to do this?
Recommended answer
Here are two ways. The first creates the dimnames for the matrix about to be built, lays the data out as a 12-row matrix (one column per year), transposes it and converts it to a data frame. The second builds a list of year and month grouping variables and passes it to tapply, which returns a year-by-month matrix that can then be converted to a data frame.
# create test data
set.seed(123)
tt <- ts(rnorm(12*5, 17, 8), start=c(1981,1), frequency = 12)
1) matrix. This solution requires that the series consists of whole, consecutive years.
dmn <- list(month.abb, unique(floor(time(tt))))
as.data.frame(t(matrix(tt, 12, dimnames = dmn)))
If we don't care about the nice names, it is just as.data.frame(t(matrix(tt, 12))).
We could replace the dmn <- line with the following simpler line, using @thelatemail's comment:
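# .preformat.ts() returns years as rows and months as columns, so reverse
# the dimnames to match the 12-row (months x years) matrix built above
dmn <- rev(dimnames(.preformat.ts(tt)))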
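Because the matrix approach relies on the series covering whole years, a small guard can make that assumption explicit. The following is only a minimal sketch: the helper name ts_to_wide is hypothetical and not part of the original answer, and it assumes a monthly series.

```r
# Hypothetical helper (not from the original answer): wraps the matrix
# approach and stops early if the series does not cover whole years.
ts_to_wide <- function(x) {
  stopifnot(is.ts(x), frequency(x) == 12)
  # the series must start in January and hold a multiple of 12 values
  if (cycle(x)[1] != 1 || length(x) %% 12 != 0)
    stop("series must cover whole years (January to December)")
  dmn <- list(month.abb, unique(floor(time(x))))
  as.data.frame(t(matrix(x, 12, dimnames = dmn)))
}

set.seed(123)
tt <- ts(rnorm(12*5, 17, 8), start = c(1981, 1), frequency = 12)
DF <- ts_to_wide(tt)
dim(DF)  # rows are years, columns are months
```

A series that starts mid-year, such as window(tt, start = c(1981, 3)), is rejected by this guard instead of being silently misaligned.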
2) tapply. A more general solution using tapply is the following:
Month <- factor(cycle(tt), levels = 1:12, labels = month.abb)
tapply(tt, list(year = floor(time(tt)), month = Month), c)
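To illustrate why the tapply approach is more general, here is a sketch that applies it to a series starting mid-year; months without an observation come out as NA. The series tt2 and the final as.data.frame step are illustrative additions, not part of the original answer.

```r
# illustrative series running from March 1981 to June 1982 (16 values)
set.seed(123)
tt2 <- ts(rnorm(16, 17, 8), start = c(1981, 3), frequency = 12)

Month <- factor(cycle(tt2), levels = 1:12, labels = month.abb)
X2 <- tapply(tt2, list(year = floor(time(tt2)), month = Month), c)

# tapply returns a year-by-month matrix; unobserved months are NA
DF2 <- as.data.frame(X2)
DF2
```

Fixing the factor levels to 1:12 is what keeps all twelve month columns in the result even when some months never occur in the data.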
Note: To invert this, suppose X is the result of any of the solutions above. Then try:
ts(c(t(X)), start = 1981, frequency = 12)
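As a quick sanity check of the inversion, one might confirm that the round trip reproduces the original series. This sketch reuses the test data and the matrix solution from above; reading the start year from the row names rather than hard-coding 1981 is an added assumption that the row names are the years.

```r
set.seed(123)
tt <- ts(rnorm(12*5, 17, 8), start = c(1981, 1), frequency = 12)

# forward: wide data frame with one row per year (matrix solution)
dmn <- list(month.abb, unique(floor(time(tt))))
X <- as.data.frame(t(matrix(tt, 12, dimnames = dmn)))

# backward: read the values row by row and rebuild the series
start_year <- as.numeric(rownames(X)[1])
tt_back <- ts(c(t(X)), start = start_year, frequency = 12)

# compare the rebuilt series with the original
all.equal(tt, tt_back)
```

The transpose matters here: c() flattens a matrix column by column, so t(X) is needed to read the values year by year in calendar order.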
Update
Improvement motivated by the comments of @thelatemail below.