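Problem description
I have a data file with a single column of timestamps:
'2012-02-01 17:42:44'
'2012-02-01 17:42:44'
'2012-02-01 17:42:44'
...
I want to split the data up so that I have a count at the top of each hour. Say:
'2012-02-01 17:00:00' 20
'2012-02-01 18:00:00' 30
The '20' and '30' represent the number of time series entries in that hour. I want to be able to graph time against that count. How can I do this with R?
Here is my current line graph plot.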
library(ggplot2)
req <- read.table("times1.dat")
summary(req)
da <- req$V2
db <- req$V1
time <- as.POSIXct(db)
png('time_data_errs.png', width = 800, height = 600)
gg <- qplot(time, da) + geom_line()
print(gg)
dev.off()
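Solution
It sounds like you want to use cut to figure out how many values occur within each hour.
It's generally helpful if you can provide some sample data. Here's some: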
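set.seed(1) # So you can get the same numbers as I do
# (R >= 3.6.0 changed sample()'s default algorithm, so newer versions give different numbers than shown here)
MyDates <- ISOdatetime(2012, 1, 1, 0, 0, 0, tz = "GMT") + sample(1:27000, 500)
head(MyDates)
# [1] "2012-01-01 01:59:29 GMT" "2012-01-01 02:47:27 GMT" "2012-01-01 04:17:46 GMT"
# [4] "2012-01-01 06:48:39 GMT" "2012-01-01 01:30:45 GMT" "2012-01-01 06:44:13 GMT"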
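You can use table and cut with the argument breaks = "hour" to find the frequencies per hour (see ?cut.Date for more info; the method used for date-times like these is documented under ?cut.POSIXt).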
MyDatesTable <- table(cut(MyDates, breaks = "hour"))
MyDatesTable
#
# 2012-01-01 00:00:00 2012-01-01 01:00:00 2012-01-01 02:00:00 2012-01-01 03:00:00
#                  59                  73                  74                  83
# 2012-01-01 04:00:00 2012-01-01 05:00:00 2012-01-01 06:00:00 2012-01-01 07:00:00
#                  52                  62                  64                  33
# Or a data.frame if you prefer
data.frame(MyDatesTable)
#                  Var1 Freq
# 1 2012-01-01 00:00:00   59
# 2 2012-01-01 01:00:00   73
# 3 2012-01-01 02:00:00   74
# 4 2012-01-01 03:00:00   83
# 5 2012-01-01 04:00:00   52
# 6 2012-01-01 05:00:00   62
# 7 2012-01-01 06:00:00   64
# 8 2012-01-01 07:00:00   33
Finally, here's a line plot of the MyDatesTable object:
plot(MyDatesTable, type = "l", xlab = "Time", ylab = "Freq")
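Applied to data shaped like the question's, the same idea might look like the sketch below. The five timestamps are made up for illustration, and the names stamps, bins, and counts are my own. Rather than parsing the factor labels back into times, it rebuilds the hour column with seq(), on the assumption that cut() produces one level per consecutive hour, starting at the hour of the earliest timestamp.

```r
# Hypothetical timestamps standing in for the one-column data file
stamps <- c("2012-02-01 17:42:44", "2012-02-01 17:42:44",
            "2012-02-01 17:59:10", "2012-02-01 18:05:00",
            "2012-02-01 20:30:00")
times <- as.POSIXct(stamps, tz = "GMT")

# One factor level per hour between the first and last timestamp
bins <- cut(times, breaks = "hour")

# Rebuild the hour starts as POSIXct so they can go on a real time axis
counts <- data.frame(
  hour = seq(as.POSIXct(trunc(min(times), units = "hours")),
             by = "hour", length.out = nlevels(bins)),
  n    = as.vector(table(bins))
)
counts

# Base R line plot of count against time
plot(counts$hour, counts$n, type = "l", xlab = "Time", ylab = "Count")
```

In this made-up data the 19:00 hour has no entries, so it appears with a count of 0 instead of being dropped.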
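cut can handle a range of time intervals. For example, if you wanted to tabulate for every 30 minutes, you can easily adapt the breaks argument to handle that:
data.frame(table(cut(MyDates, breaks = "30 mins")))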
#                   Var1 Freq
# 1  2012-01-01 00:00:00   22
# 2  2012-01-01 00:30:00   37
# 3  2012-01-01 01:00:00   38
# 4  2012-01-01 01:30:00   35
# 5  2012-01-01 02:00:00   32
# 6  2012-01-01 02:30:00   42
# 7  2012-01-01 03:00:00   39
# 8  2012-01-01 03:30:00   44
# 9  2012-01-01 04:00:00   25
# 10 2012-01-01 04:30:00   27
# 11 2012-01-01 05:00:00   33
# 12 2012-01-01 05:30:00   29
# 13 2012-01-01 06:00:00   29
# 14 2012-01-01 06:30:00   35
# 15 2012-01-01 07:00:00   33
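One thing to watch with multi-minute intervals: as far as I can tell, cut with breaks = "30 mins" starts its first bin at the earliest timestamp truncated to the minute, not at a clock half-hour. The bins above line up with :00 and :30 only because the earliest sample falls in the first minute. If you need clock-aligned bins, one option is to pass explicit break points. A small sketch with made-up timestamps (times, edges, aligned, and unaligned are illustrative names):

```r
# Hypothetical timestamps; the earliest one is at 17:12, not on a half-hour
times <- as.POSIXct(c("2012-02-01 17:12:05", "2012-02-01 17:40:00",
                      "2012-02-01 17:44:59", "2012-02-01 18:10:00"),
                    tz = "GMT")

# Default behaviour: 30-minute bins counted from the first observation's minute
unaligned <- table(cut(times, breaks = "30 mins"))

# Explicit break points on clock half-hours, spanning the whole hours
# that contain the first and last timestamps
first <- as.POSIXct(trunc(min(times), units = "hours"))
last  <- as.POSIXct(trunc(max(times), units = "hours")) + 3600
edges <- seq(first, last, by = "30 min")
aligned <- table(cut(times, breaks = edges))

unaligned
aligned
```

With explicit breaks, any timestamp outside the range of edges would become NA, which is why the sketch extends the last edge to the end of the final hour.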
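Update
Since you were trying to plot with ggplot2, here's one approach (not sure if it is the best, since I usually use base R's graphics when I need to plot).
Create a data.frame of the table (as demonstrated above), add a dummy "group" variable, and plot it as follows:
MyDatesDF <- data.frame(MyDatesTable, grp = 1)
ggplot(MyDatesDF, aes(Var1, Freq)) + geom_line(aes(group = grp))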
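The dummy group is needed because Var1 is a factor, so ggplot2 treats the x axis as discrete. A possible alternative, sketched below, is to give ggplot2 a real date-time column, in which case geom_line needs no grouping and scale_x_datetime can format the axis. The hour column is rebuilt with seq() on the assumption that cut produces one level per consecutive hour; bins, hour, and p are names I've made up for the example.

```r
library(ggplot2)

set.seed(1)
MyDates <- ISOdatetime(2012, 1, 1, 0, 0, 0, tz = "GMT") + sample(1:27000, 500)

# Hourly bins, as before
bins <- cut(MyDates, breaks = "hour")

# Data frame with a POSIXct hour column instead of a factor
MyDatesDF <- data.frame(
  hour = seq(as.POSIXct(trunc(min(MyDates), units = "hours")),
             by = "hour", length.out = nlevels(bins)),
  Freq = as.vector(table(bins))
)

# Continuous time axis, labelled as hours and minutes in GMT
p <- ggplot(MyDatesDF, aes(hour, Freq)) +
  geom_line() +
  scale_x_datetime(date_labels = "%H:%M", timezone = "GMT")
print(p)
```

A continuous time axis also spaces the points by actual elapsed time, which matters if you later switch to uneven or explicit break points.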