R ts with missing values



I have a data frame, read from a csv file, that holds daily observations:

Date        Value
2010-01-04  23.4
2010-01-05  12.7
2010-01-04  20.1
2010-01-07  18.2

PROBLEM: Missing data. The forecast package expects a plain ts object that contains no missing data, while my dataset has missing data on most weekends and at other random points.

A direct conversion to ts is not expected to work, because ts assumes regularly spaced observations:

ts(values, start = c(1997, 1), frequency = 1)

The only solution I can think of is to transform the daily data into weekly data, but I am new to R and better solutions may exist.


One option is to expand your date index to include the missing observations, and then use na.approx from zoo to fill in the missing values via interpolation.

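# Both calls are truncated in the source; the arguments below are a reconstruction (assumption).
allDates <- seq.Date(min(values$Date), max(values$Date), by = "day")
allValues <- merge(data.frame(Date = allDates), values, all.x = TRUE)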
R> head(allValues,7)
        Date      Value
1 2010-01-05 -0.6041787
2 2010-01-06  0.2274668
3 2010-01-07 -1.2751761
4 2010-01-08 -0.8696818
5 2010-01-09         NA
6 2010-01-10         NA
7 2010-01-11 -0.3486378
library(zoo)  # provides zoo() and na.approx()
zooValues <- zoo(allValues$Value,allValues$Date)
R> head(zooValues,7)
2010-01-05 2010-01-06 2010-01-07 2010-01-08 2010-01-09 2010-01-10 2010-01-11
-0.6041787  0.2274668 -1.2751761 -0.8696818         NA         NA -0.3486378
approxValues <- na.approx(zooValues)
R> head(approxValues,7)
2010-01-05 2010-01-06 2010-01-07 2010-01-08 2010-01-09 2010-01-10 2010-01-11
-0.6041787  0.2274668 -1.2751761 -0.8696818 -0.6960005 -0.5223192 -0.3486378

Even with missing values, zooValues is still a legitimate zoo object; for example, plot(zooValues) will work (with discontinuities at the missing values). If you plan to fit a model to the data, though, you will most likely be better off using na.approx to replace the missing values.


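# Sample data behind the output shown above: 120 days after t0, weekends dropped.
# weekdays() returns locale-dependent names; this filter assumes an English locale.
t0 <- "2010-01-04"
Dates <- as.Date(t0)+1:120  # as.Date() parses ISO dates directly, so ymd() is not needed
weekDays <- Dates[!(weekdays(Dates) %in% c("Saturday","Sunday"))]
values <- data.frame(Date=weekDays,Value=rnorm(length(weekDays)))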
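
The steps in this answer can also be sketched end to end in base R, with approx() standing in for zoo's na.approx (both interpolate linearly by default). This is only a rough illustration: the 28-day window, the set.seed call, the names wday, filled and tsValues, and frequency = 7 (a weekly period) are assumptions and are not part of the original answer.

```r
# Minimal base-R sketch; approx() is swapped in for zoo::na.approx.
set.seed(42)  # assumption: only to make the random sample reproducible

# Weekday-only sample data, like the answer's but over 28 days
dates <- as.Date("2010-01-04") + 1:28
wday <- as.POSIXlt(dates)$wday            # 0 = Sunday, 6 = Saturday (locale independent)
values <- data.frame(Date = dates[!(wday %in% c(0, 6))])
values$Value <- rnorm(nrow(values))

# Expand to every calendar day; weekends become NA after the left join
allDates <- seq(min(values$Date), max(values$Date), by = "day")
allValues <- merge(data.frame(Date = allDates), values, all.x = TRUE)

# Linear interpolation through the observed points, evaluated at every day
filled <- approx(x = as.numeric(values$Date), y = values$Value,
                 xout = as.numeric(allDates))$y

# Plain ts object without missing values; frequency = 7 is an assumed weekly period
tsValues <- ts(filled, frequency = 7)
```

tsValues can then be handed to functions that require a gap-free ts object, such as those in the forecast package. Keep in mind that the weekend values are interpolated rather than observed.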

