Inserting rows for missing dates/times

Question

I am new to R but have turned to it to solve a problem with a large data set I am trying to process. I currently have 4 columns of data (Y values) set against minute-interval timestamps (month/day/year hour:min) (X values), as below:

    timestamp          tr            tt         sr         st
1   9/1/01 0:00   1.018269e+02   -312.8622   -1959.393   4959.828
2   9/1/01 0:01   1.023567e+02   -313.0002   -1957.755   4958.935
3   9/1/01 0:02   1.018857e+02   -313.9406   -1956.799   4959.938
4   9/1/01 0:03   1.025463e+02   -310.9261   -1957.347   4961.095
5   9/1/01 0:04   1.010228e+02   -311.5469   -1957.786   4959.078

My problem is that some timestamp values are missing. For example, there may be a gap between 9/1/01 0:13 and 9/1/01 0:27, and such gaps occur irregularly throughout the data set. I need to put several of these series into the same database, and because the missing values differ from series to series, the dates do not currently align row by row.

I would like to generate rows for these missing timestamps and fill the Y columns with blank values (no data, not zero), so that I have a continuous time series.

I'm honestly not quite sure where to start (I have not really used R before, so I am learning as I go along!), but any help would be much appreciated. So far I have installed chron and zoo, since they seem like they might be useful.

Thanks!

Recommended answer

I think the easiest approach is to parse the date first, as already described, convert the data to a zoo object, and then do a merge:

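library(zoo)

df$timestamp <- as.POSIXct(df$timestamp, format = "%m/%d/%y %H:%M")  # parse strings as date-times

df1.zoo <- zoo(df[, -1], order.by = df[, 1])  # use the timestamp column as the index

df2 <- merge(df1.zoo, zoo(, seq(start(df1.zoo), end(df1.zoo), by = "min")), all = TRUE)  # add a row for every missing minute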

start() and end() take the first and last timestamps from df1.zoo (your original data), and by sets the step of the sequence; "min" is what your example needs. all = TRUE keeps every timestamp from both series, so the Y columns are set to NA at the timestamps that were missing.
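To see the effect on a small case, here is a minimal sketch of the same approach on made-up data with a deliberate gap. The values, the names z, grid, filled and out, and the tz = "UTC" setting are assumptions for illustration only and are not part of the original question or answer; fixing the time zone is just one way to avoid surprises around daylight-saving changes.

```r
library(zoo)

# Toy data (made-up values): the rows for 0:02 and 0:03 are missing on purpose.
df <- data.frame(
  timestamp = c("9/1/01 0:00", "9/1/01 0:01", "9/1/01 0:04"),
  tr = c(101.8, 102.4, 101.0),
  tt = c(-312.9, -313.0, -311.5),
  stringsAsFactors = FALSE
)

# Parse the strings as date-times. tz = "UTC" is an assumption made here.
df$timestamp <- as.POSIXct(df$timestamp, format = "%m/%d/%y %H:%M", tz = "UTC")

# zoo series indexed by the timestamp column.
z <- zoo(df[, -1], order.by = df$timestamp)

# Zero-width zoo series holding every minute from the first to the last timestamp.
grid <- zoo(, seq(start(z), end(z), by = "min"))

# Merging keeps every index value; minutes absent from z get NA in all columns.
filled <- merge(z, grid, all = TRUE)

# Convert back to a data frame if a plain table is more convenient.
out <- data.frame(timestamp = index(filled), coredata(filled), row.names = NULL)
print(out)
```

Since several series need to line up, it may be convenient to build one grid from the earliest start and latest end across all of them and merge each series against that same grid, so every result has identical rows.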
