This article covers how to split time series data into time intervals (for example, one hour) and then plot the counts.

Problem description

I just have a data file with a single column of timestamps:

'2012-02-01 17:42:44'
'2012-02-01 17:42:44'
'2012-02-01 17:42:44'

...I want to split the data up so that I have one count per hour, labeled with the top of that hour. Say:

'2012-02-01 17:00:00'  20   
'2012-02-01 18:00:00'  30  

The '20' and '30' represent the number of time series entries that fall within that hour. I want to be able to graph the time against that count. How can I do this with R?

Here is the code for my current line plot.

library(ggplot2)

req <- read.table("times1.dat")
summary(req)

da <- req$V2
db <- req$V1

time <- as.POSIXct(db)

png('time_data_errs.png', width=800, height=600)
gg <- qplot(time, da) + geom_line()

print(gg)
dev.off()

Recommended answer

It sounds like you want to use cut to figure out how many values occur within each hour.

It's generally helpful if you can provide some sample data. Here's some:

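# Fix the seed so the sample is reproducible. The output shown below predates R 3.6.0,
# which changed sample()'s default algorithm, so newer versions give different numbers.
set.seed(1)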
MyDates <- ISOdatetime(2012, 1, 1, 0, 0, 0, tz = "GMT") + sample(1:27000, 500)
head(MyDates)
# [1] "2012-01-01 01:59:29 GMT" "2012-01-01 02:47:27 GMT" "2012-01-01 04:17:46 GMT"
# [4] "2012-01-01 06:48:39 GMT" "2012-01-01 01:30:45 GMT" "2012-01-01 06:44:13 GMT"

You can use table and cut (with the argument breaks="hour"; see ?cut.Date for more info) to find the frequencies per hour.

MyDatesTable <- table(cut(MyDates, breaks="hour"))
MyDatesTable
# 
# 2012-01-01 00:00:00 2012-01-01 01:00:00 2012-01-01 02:00:00 2012-01-01 03:00:00 
#                  59                  73                  74                  83 
# 2012-01-01 04:00:00 2012-01-01 05:00:00 2012-01-01 06:00:00 2012-01-01 07:00:00 
#                  52                  62                  64                  33 
# Or a data.frame if you prefer
data.frame(MyDatesTable)
#                  Var1 Freq
# 1 2012-01-01 00:00:00   59
# 2 2012-01-01 01:00:00   73
# 3 2012-01-01 02:00:00   74
# 4 2012-01-01 03:00:00   83
# 5 2012-01-01 04:00:00   52
# 6 2012-01-01 05:00:00   62
# 7 2012-01-01 06:00:00   64
# 8 2012-01-01 07:00:00   33
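
The same idea can be applied to timestamps in the format shown in the question. The following is only a minimal sketch: the inline `raw_lines` vector stands in for the asker's file, the names `req`, `times`, and `hourly` are made up for illustration, and the "GMT" time zone is an assumption (the question does not state one).

```r
# Stand-in for the data file: one quoted timestamp per line (assumed layout)
raw_lines <- c("'2012-02-01 17:42:44'",
               "'2012-02-01 17:42:44'",
               "'2012-02-01 17:42:44'",
               "'2012-02-01 18:05:10'",
               "'2012-02-01 18:59:59'")

# read.table strips the quotes, so each timestamp arrives as a single field in V1
req <- read.table(text = raw_lines, stringsAsFactors = FALSE)

# Parse the strings as date-times; the time zone here is an assumption
times <- as.POSIXct(req$V1, format = "%Y-%m-%d %H:%M:%S", tz = "GMT")

# Count the entries per hour and keep the result as a data.frame
hourly <- data.frame(table(cut(times, breaks = "hour")))
print(hourly)
```

With a real file, `read.table("times1.dat")` would replace the `text =` argument; everything after that line stays the same.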

Finally, here's a line plot of the MyDatesTable object:

plot(MyDatesTable, type="l", xlab="Time", ylab="Freq")

cut can handle a range of time intervals. For example, if you want to tabulate every 30 minutes, you can adapt the breaks argument accordingly:

data.frame(table(cut(MyDates, breaks = "30 mins")))
#                   Var1 Freq
# 1  2012-01-01 00:00:00   22
# 2  2012-01-01 00:30:00   37
# 3  2012-01-01 01:00:00   38
# 4  2012-01-01 01:30:00   35
# 5  2012-01-01 02:00:00   32
# 6  2012-01-01 02:30:00   42
# 7  2012-01-01 03:00:00   39
# 8  2012-01-01 03:30:00   44
# 9  2012-01-01 04:00:00   25
# 10 2012-01-01 04:30:00   27
# 11 2012-01-01 05:00:00   33
# 12 2012-01-01 05:30:00   29
# 13 2012-01-01 06:00:00   29
# 14 2012-01-01 06:30:00   35
# 15 2012-01-01 07:00:00   33
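
One caveat worth checking on your own data: with a minute-based specification such as "30 mins", cut appears to start the first bin at the earliest timestamp with only its seconds dropped, not at the top of the hour. The sample data above presumably lines up on :00 and :30 only because its earliest value falls within the first minute. A hedged sketch of one way to force aligned bins is to pass explicit break points; the vector `x` and the names `aligned_breaks`, `aligned`, and `unaligned` are invented for this illustration.

```r
x <- as.POSIXct(c("2012-02-01 17:05:10", "2012-02-01 17:32:00",
                  "2012-02-01 17:45:30", "2012-02-01 18:40:00"), tz = "GMT")

# Default behaviour: bins begin at 17:05:00, the earliest value minus its seconds
unaligned <- table(cut(x, breaks = "30 mins"))

# Round the earliest timestamp down to the hour so bins fall on :00 and :30
start <- as.POSIXct(trunc(min(x), units = "hours"))

# Extend one interval past the latest timestamp so it lands inside a bin
aligned_breaks <- seq(start, max(x) + 30 * 60, by = "30 min")

aligned <- table(cut(x, breaks = aligned_breaks))

print(unaligned)
print(aligned)
```

The two tables hold the same four timestamps but split them differently, because the bin edges differ by five minutes.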

Update

Since you were trying to plot with ggplot2, here's one approach (I'm not sure it is the best, since I usually use base R's graphics when I need to plot).

Create a data.frame of the table (as demonstrated above), add a dummy "group" variable, and plot it as follows:

MyDatesDF <- data.frame(MyDatesTable, grp = 1)
ggplot(MyDatesDF, aes(Var1, Freq)) + geom_line(aes(group = grp))
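
In the data.frame built from the table, Var1 is a factor, so ggplot2 treats the x axis as discrete, which is why the dummy group is needed. A possible alternative, sketched here under assumptions, is to keep the bin starts as real date-times. This is not the answer's method: it swaps cut for findInterval and tabulate, and the sample vector `times` and the names `hour_starts` and `hourly` are invented for illustration. The plot is guarded so the counting part runs even without ggplot2 installed.

```r
# Small made-up sample: two events in hour 0, two in hour 1, none in hour 2, one in hour 3
times <- as.POSIXct("2012-01-01 00:00:00", tz = "GMT") + c(30, 1200, 4000, 4100, 11000)

# Hourly bin starts from the first observed hour to the last
start <- as.POSIXct(trunc(min(times), units = "hours"))
hour_starts <- seq(start, max(times), by = "hour")

# Assign each timestamp to its bin and count, keeping empty hours as zero
bin <- findInterval(as.numeric(times), as.numeric(hour_starts))
hourly <- data.frame(hour = hour_starts,
                     n = tabulate(bin, nbins = length(hour_starts)))
print(hourly)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(hourly, aes(hour, n)) +
    geom_line() +
    scale_x_datetime(date_labels = "%H:%M")
  print(p)
}
```

Because `hour` is a POSIXct column, geom_line connects the points without a group aesthetic, and hours with no events show up as zeros on the line.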

