Problem description
I have a daily time series that begins on a Saturday and ends on a Wednesday. It has a clear weekly period. It is stored in a vector a in R, so I tried to convert it into a time series object -
ts(a,frequency=7)
This gives me -
Time Series:
Start = c(1, 1)
End = c(13, 5)
What do (1, 1) and (13, 5) mean? And what is the best way to specify start and end in this scenario? All the examples on the internet deal with yearly data, not daily data.
Recommended answer
Let's explore how ts works with different frequencies, using the documentation (?ts).
Suppose this is your data
dat <- data.frame(myts = sample(10, 24, replace = TRUE),
                  Date = seq(as.Date("2008-10-11"), as.Date("2008-10-11") + 23, by = 1))
# myts Date
# 1 6 2008-10-11
# 2 9 2008-10-12
# 3 6 2008-10-13
# 4 9 2008-10-14
# 5 8 2008-10-15
# 6 6 2008-10-16
# 7 1 2008-10-17
# 8 9 2008-10-18
# 9 3 2008-10-19
# 10 5 2008-10-20
# 11 7 2008-10-21
# 12 4 2008-10-22
# 13 2 2008-10-23
# 14 9 2008-10-24
# 15 5 2008-10-25
# 16 9 2008-10-26
# 17 7 2008-10-27
# 18 8 2008-10-28
# 19 7 2008-10-29
# 20 2 2008-10-30
# 21 6 2008-10-31
# 22 6 2008-11-01
# 23 8 2008-11-02
# 24 1 2008-11-03
Let's compare the outputs for different frequencies on the same data, using an arbitrary start point
print(ts(dat$myts, frequency = 7, start = c(1950, 3)), calendar = TRUE)
#      p1 p2 p3 p4 p5 p6 p7
# 1950        6  9  6  9  8
# 1951  6  1  9  3  5  7  4
# 1952  2  9  5  9  7  8  7
# 1953  2  6  6  8  1
print(ts(dat$myts, frequency = 12, start = c(1950, 3)), calendar = TRUE)
#      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
# 1950           6   9   6   9   8   6   1   9   3   5
# 1951   7   4   2   9   5   9   7   8   7   2   6   6
# 1952   8   1
print(ts(dat$myts, frequency = 4, start = c(1950, 3)), calendar = TRUE)
#      Qtr1 Qtr2 Qtr3 Qtr4
# 1950              6    9
# 1951    6    9    8    6
# 1952    1    9    3    5
# 1953    7    4    2    9
# 1954    5    9    7    8
# 1955    7    2    6    6
# 1956    8    1
print(ts(dat$myts, frequency = 7), calendar = TRUE)
#   p1 p2 p3 p4 p5 p6 p7
# 1  6  9  6  9  8  6  1
# 2  9  3  5  7  4  2  9
# 3  5  9  7  8  7  2  6
# 4  6  8  1
We can learn 3 things from these outputs
1- ts is familiar with frequencies 12 and 4 and labels them as months and quarters, while it prints frequency 7 in a less straightforward way (as positions p1 to p7).
2- The first number in the start parameter is the number of the period, where the meaning of a period depends on the frequency (a year, a week, and so on). The second number is the position of the first observation within that period (as not all series begin in January or on a Sunday).
3- When you don't specify a start point, ts assumes that the series starts at the beginning of the first period (hence the (1, 1) in your example).
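To make point 3 concrete for the series in the question, here is a minimal sketch. With frequency 7 and the default start, End = c(13, 5) implies 12 * 7 + 5 = 89 observations, so the sketch uses 89 random placeholder values in place of the real vector a, which the question does not show. The names x, start_x, end_x and end_by_hand are only illustrative.

```r
# Placeholder values standing in for the asker's vector `a` (the real data is not shown).
# End = c(13, 5) with frequency 7 implies 12 * 7 + 5 = 89 observations.
set.seed(42)
a <- rnorm(89)
x <- ts(a, frequency = 7)

start_x <- start(x)  # period and position of the first observation
end_x   <- end(x)    # period and position of the last observation

# The same end point computed by hand from the series length
n <- length(a)
end_by_hand <- c((n - 1) %/% 7 + 1, (n - 1) %% 7 + 1)

print(start_x)      # 1 1
print(end_x)        # 13 5
print(end_by_hand)  # 13 5
```

So (13, 5) simply means the fifth observation of the thirteenth 7-day cycle. With the default start, position 1 of each cycle is whatever weekday the series happens to begin on (a Saturday in the question), not a calendar Sunday or Monday.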
Now, to make this time series more meaningful, you could compute the week number of the year (a year has about 52 weeks) and the day number of your first observation (e.g. 1 = Sunday, 2 = Monday, etc.) and pass them to the start parameter (see ?strftime).
startW <- as.numeric(strftime(head(dat$Date, 1), format = "%U"))      # week of the year, weeks starting on Sunday
startD <- as.numeric(strftime(head(dat$Date, 1), format = "%w")) + 1  # %w is 0 = Sunday ... 6 = Saturday, so add 1
print(ts(dat$myts, frequency = 7, start = c(startW, startD)), calendar = TRUE)
#    p1 p2 p3 p4 p5 p6 p7
# 40                    6
# 41  9  6  9  8  6  1  9
# 42  3  5  7  4  2  9  5
# 43  9  7  8  7  2  6  6
# 44  8  1
Which means that our first observation (on 2008-10-11) was the Saturday of week 40 of 2008, counting weeks from Sunday as %U does.
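As for the end point, you normally don't need to specify it: ts derives it from start, frequency and the number of observations. The sketch below is one way to check that the positions line up with the calendar, assuming the 1 = Sunday ... 7 = Saturday convention mentioned above. The names dates, values, day_pos and x are illustrative, the values are placeholders rather than the sample data above, and the period number is simply set to 1.

```r
# Same 24 dates as in the example; placeholder values instead of the random sample
dates  <- seq(as.Date("2008-10-11"), by = "day", length.out = 24)
values <- seq_along(dates)

# Day position of every date: %w is 0 = Sunday ... 6 = Saturday, so add 1
day_pos <- as.integer(format(dates, "%w")) + 1L

# Only the position of the first observation is passed; the period number is arbitrary here
x <- ts(values, frequency = 7, start = c(1, day_pos[1]))

# cycle() returns the position of each observation within its period
positions_match <- all(as.integer(cycle(x)) == day_pos)

print(positions_match)  # TRUE
print(end(x))           # 5 2 (the last date, 2008-11-03, is a Monday)
```

If cycle(x) agrees with the weekday computed from the dates, the start was specified correctly, and end(x) falls on the right weekday without being set by hand.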