Problem description
I recognize there are several related questions, but I seem to be stumbling somewhere here. I followed this thread as best I could: Interpolating timeseries, but I get error messages (see below):
My dataset contains samples collected several times a day at regular intervals (every six hours in the subsample below). I would like to interpolate these data to hourly values. Below is a subsample of my much larger dataset:
vis <- structure(list(datetime = structure(1:24, .Label = c("2002-05-01-00",
"2002-05-01-06", "2002-05-01-12", "2002-05-01-18", "2002-05-02-00",
"2002-05-02-06", "2002-05-02-12", "2002-05-02-18", "2002-05-03-00",
"2002-05-03-06", "2002-05-03-12", "2002-05-03-18", "2002-05-04-00",
"2002-05-04-06", "2002-05-04-12", "2002-05-04-18", "2002-05-05-00",
"2002-05-05-06", "2002-05-05-12", "2002-05-05-18", "2002-05-06-00",
"2002-05-06-06", "2002-05-06-12", "2002-05-06-18"), class = "factor"),
VIStot = c(0L, 128L, 359L, 160L, 1L, 121L, 316L, 162L, 1L,
132L, 339L, 163L, 2L, 137L, 364L, 155L, 3L, 122L, 345L, 179L,
3L, 125L, 147L, 77L)), .Names = c("datetime", "VIStot"), class = "data.frame", row.names = c(NA,
-24L))
My code to interpolate to hourly resolution is as follows:
vis[, c(2)] <- sapply(vis[, c(2)], as.numeric)
library(zoo)
vis$datetime <- as.POSIXct(vis$datetime, format="%Y-%m-%d-%H")
hr <- zoo(vis$VIStot, vis$datetime)
int <- na.spline(hr$VIStot)
This ends with an error message.
Am I not formatting the datetime correctly? Why is hr not reading both VIStot and datetime?
Also, once interpolated, I would like to export the values to a .csv file.
Recommended answer
Two thoughts on this. First, the function na.spline fills in NA values, and vis$VIStot has none. Note also that hr is a univariate zoo series indexed by datetime, so it has no VIStot column to pull out with $. So perhaps your first issue is that you are not generating a proper sequence on which the function can operate.
Second, if you are looking for simple linear interpolation, then how about:
## using your "vis" above, with datetime already converted to POSIXct
newdt <- seq.POSIXt(vis$datetime[1], tail(vis$datetime, n=1), by='1 hour')
data.frame(datetime=newdt, VIStot=approx(vis$datetime, vis$VIStot, newdt)$y)
## datetime VIStot
## 1 2002-05-01 00:00:00 0.00000
## 2 2002-05-01 01:00:00 21.33333
## 3 2002-05-01 02:00:00 42.66667
## 4 2002-05-01 03:00:00 64.00000
## 5 2002-05-01 04:00:00 85.33333
## 6 2002-05-01 05:00:00 106.66667
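To see where those numbers come from, here is a minimal base-R sketch on just the first three samples. The names obs, grid and hourly are made up for illustration, and tz = "UTC" is an assumption added only to keep the result reproducible across machines. Between 00:00 (0) and 06:00 (128) each hourly step adds 128 / 6.

```r
# Minimal sketch: linear interpolation of three six-hourly samples to hourly values.
# tz = "UTC" is an assumption, not something taken from the original data.
obs <- data.frame(
  datetime = as.POSIXct(c("2002-05-01-00", "2002-05-01-06", "2002-05-01-12"),
                        format = "%Y-%m-%d-%H", tz = "UTC"),
  VIStot = c(0, 128, 359)
)

# Hourly grid spanning the observations
grid <- seq(min(obs$datetime), max(obs$datetime), by = "1 hour")

# approx() interpolates on the numeric (seconds since epoch) representation
hourly <- data.frame(
  datetime = grid,
  VIStot = approx(as.numeric(obs$datetime), obs$VIStot,
                  xout = as.numeric(grid))$y
)

print(head(hourly, 7))
```

The observed values are reproduced exactly at 00:00, 06:00 and 12:00; only the hours in between are filled in.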
I recognize this is somewhat of a workaround, but you can easily convert the result into a zoo object from here.
Another way I got it to work:
library(zoo)
vis2 <- merge(vis, data.frame(datetime=newdt), by.x='datetime', all.y=TRUE)
head(vis2, n=8)
## datetime VIStot
## 1 2002-05-01 00:00:00 0
## 2 2002-05-01 01:00:00 NA
## 3 2002-05-01 02:00:00 NA
## 4 2002-05-01 03:00:00 NA
## 5 2002-05-01 04:00:00 NA
## 6 2002-05-01 05:00:00 NA
## 7 2002-05-01 06:00:00 128
## 8 2002-05-01 07:00:00 NA
hr2 <- zoo(vis2$VIStot, vis2$datetime)
head(hr2, n=8)
## 2002-05-01 00:00:00 2002-05-01 01:00:00 2002-05-01 02:00:00
## 0 NA NA
## 2002-05-01 03:00:00 2002-05-01 04:00:00 2002-05-01 05:00:00
## NA NA NA
## 2002-05-01 06:00:00 2002-05-01 07:00:00
## 128 NA
And here it is:
head(na.spline(hr2), n=8)
## 2002-05-01 00:00:00 2002-05-01 01:00:00 2002-05-01 02:00:00
## 0.000000 -12.533246 -8.229736
## 2002-05-01 03:00:00 2002-05-01 04:00:00 2002-05-01 05:00:00
## 10.442935 41.017177 81.025396
## 2002-05-01 06:00:00 2002-05-01 07:00:00
## 128.000000 179.200449
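A possibly shorter route, sketched here under the assumption that your installed zoo version supports the xout argument of na.spline() and na.approx(): pass the hourly grid directly instead of merging in NA rows first. The data are the first six samples from the question; the names stamps and grid, and tz = "UTC", are assumptions for the sake of a reproducible example.

```r
library(zoo)

# First six samples from the question; tz = "UTC" is an assumption
stamps <- as.POSIXct(c("2002-05-01-00", "2002-05-01-06", "2002-05-01-12",
                       "2002-05-01-18", "2002-05-02-00", "2002-05-02-06"),
                     format = "%Y-%m-%d-%H", tz = "UTC")
hr <- zoo(c(0, 128, 359, 160, 1, 121), stamps)

# Hourly target grid covering the observed range
grid <- seq(start(hr), end(hr), by = "1 hour")

# xout asks for interpolated values at the grid times
hr_spline <- na.spline(hr, xout = grid)  # cubic spline
hr_linear <- na.approx(hr, xout = grid)  # linear

head(hr_linear, n = 7)
```

Note that a spline can overshoot between samples (the negative values in the output above are an example), so the linear version may be the safer choice for a quantity that cannot go below zero.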
Whether you need linear interpolation, a spline, or something else, perhaps this will get you moving in the right direction.
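As for the .csv export asked about in the question, a minimal sketch with base R's write.csv() might look like the following. The hourly data frame here is a stand-in for whatever interpolated result you end up with, and the file name vis_hourly.csv and the use of tempdir() are hypothetical choices for illustration.

```r
# Stand-in for an interpolated hourly result (values are illustrative only)
hourly <- data.frame(
  datetime = seq(as.POSIXct("2002-05-01 00:00", tz = "UTC"),
                 by = "1 hour", length.out = 7),
  VIStot = seq(0, 128, length.out = 7)
)

# Format timestamps explicitly so midnight keeps its time component
out <- transform(hourly, datetime = format(datetime, "%Y-%m-%d %H:%M:%S"))

# Hypothetical output location; replace with your own path
csv_path <- file.path(tempdir(), "vis_hourly.csv")
write.csv(out, csv_path, row.names = FALSE)

# Read the file back to confirm the round trip
check <- read.csv(csv_path, stringsAsFactors = FALSE)
print(check)
```

If the result is a zoo object rather than a data frame, zoo also provides write.zoo(), which writes the index alongside the values.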