Problem description
I have a data frame, read from a CSV file, that contains daily observations:
Date Value
2010-01-04 23.4
2010-01-05 12.7
2010-01-04 20.1
2010-01-07 18.2
PROBLEM: Missing data. The forecast package expects a plain ts object that contains no missing data, while my dataset has missing data on most weekends and at other random points.
Converting to ts directly should not work:
ts(values, start = c(1997, 1), frequency = 1)
The only solution I can think of is to transform the daily data into weekly data, but I am new to R and better solutions may exist.
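For reference, the weekly aggregation mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame is a hypothetical stand-in with the same Date and Value columns, weeks are taken to start on Monday, and each week is summarised by the mean of whatever observations it has.

```r
# Hypothetical stand-in data: weekday-only observations over four weeks
dates <- seq(as.Date("2010-01-04"), by = "day", length.out = 28)
# POSIXlt$wday is 0 for Sunday and 6 for Saturday (locale-independent)
dates <- dates[!(as.POSIXlt(dates)$wday %in% c(0, 6))]
set.seed(1)
values <- data.frame(Date = dates, Value = rnorm(length(dates)))

# Label each observation with the Monday that starts its week
# (cut() on Dates with breaks = "week" starts weeks on Monday by default)
values$Week <- as.Date(as.character(cut(values$Date, breaks = "week")))

# Average the observations available in each week
weekly <- aggregate(Value ~ Week, data = values, FUN = mean)

# frequency = 52 is an assumption (roughly 52 weeks per year)
weeklyTs <- ts(weekly$Value, frequency = 52)
```

Weekly averaging hides the gaps but also discards the within-week variation, so it is a trade-off rather than a fix.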
Recommended answer
One option is to expand your date index to include the missing observations, and then use na.approx from zoo to fill in the missing values by linear interpolation.
allDates <- seq.Date(
min(values$Date),
max(values$Date),
"day")
##
allValues <- merge(
x=data.frame(Date=allDates),
y=values,
all.x=TRUE)
R> head(allValues,7)
Date Value
1 2010-01-05 -0.6041787
2 2010-01-06 0.2274668
3 2010-01-07 -1.2751761
4 2010-01-08 -0.8696818
5 2010-01-09 NA
6 2010-01-10 NA
7 2010-01-11 -0.3486378
##
zooValues <- zoo(allValues$Value,allValues$Date)
R> head(zooValues,7)
2010-01-05 2010-01-06 2010-01-07 2010-01-08 2010-01-09 2010-01-10 2010-01-11
-0.6041787 0.2274668 -1.2751761 -0.8696818 NA NA -0.3486378
##
approxValues <- na.approx(zooValues)
R> head(approxValues,7)
2010-01-05 2010-01-06 2010-01-07 2010-01-08 2010-01-09 2010-01-10 2010-01-11
-0.6041787 0.2274668 -1.2751761 -0.8696818 -0.6960005 -0.5223192 -0.3486378
Even with missing values, zooValues is still a legitimate zoo object, e.g. plot(zooValues) will work (with discontinuities at the missing values), but if you plan on fitting some sort of model to the data, you will most likely be better off using na.approx to replace the missing values.
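Once the gaps are filled, the series can be turned into a plain ts object for the forecast package. The following is a minimal sketch, not part of the original answer: the data are a hypothetical stand-in, and frequency = 7 is an assumption (a weekly cycle in daily data) that may not suit every series.

```r
library(zoo)

# Hypothetical stand-in data: weekday-only observations
dates <- seq(as.Date("2010-01-05"), by = "day", length.out = 28)
# POSIXlt$wday is 0 for Sunday and 6 for Saturday (locale-independent)
dates <- dates[!(as.POSIXlt(dates)$wday %in% c(0, 6))]
set.seed(1)
values <- data.frame(Date = dates, Value = rnorm(length(dates)))

# Expand to a complete daily index; missing days become NA
allDates <- seq(min(values$Date), max(values$Date), by = "day")
allValues <- merge(data.frame(Date = allDates), values, all.x = TRUE)

# Interpolate the NAs linearly, as in the answer above
approxValues <- na.approx(zoo(allValues$Value, allValues$Date))

# Drop the Date index and build a regular ts
# (frequency = 7 is an assumption: one seasonal cycle per week)
tsValues <- ts(coredata(approxValues), frequency = 7)

# tsValues can now be passed to forecast functions, e.g.
# forecast::auto.arima(tsValues)
```

Keep in mind that the interpolated weekend values are invented rather than observed, so any model fitted to tsValues treats them as if they were real data.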
Data:
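library(zoo)
library(lubridate)
## 120 consecutive days starting the day after t0
t0 <- "2010-01-04"
Dates <- as.Date(ymd(t0))+1:120
## drop weekends (weekdays() returns locale-dependent names; an English locale is assumed)
weekDays <- Dates[!(weekdays(Dates) %in% c("Saturday","Sunday"))]
## one random value per remaining date
set.seed(123)
values <- data.frame(Date=weekDays,Value=rnorm(length(weekDays)))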