Question
I am using both geom_histogram and hist in R with the same breakpoints, but I get different graphs. I did a quick search. Does anyone know how each function defines its breaks and why the results would differ?
These produce two different plots.
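library(ggplot2)

set.seed(25)
data <- data.frame(Mos = rnorm(500, mean = 25, sd = 8))
data$Mos <- round(data$Mos)
# ggplot2 histogram with explicit bin breaks
pAge <- ggplot(data, aes(x = Mos))
pAge + geom_histogram(breaks = seq(0, 50, by = 2))
# Base R histogram with the same breaks
hist(data$Mos, breaks = seq(0, 50, by = 2))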
Thanks.
Recommended answer
To get the same histogram in ggplot2, specify the breaks inside scale_x_continuous and the binwidth inside geom_histogram. The breaks in scale_x_continuous only set the axis tick positions; the bin width itself comes from binwidth.
Additionally, hist and histograms in ggplot2 use different defaults to create the intervals:
      hist               ggplot2
      interval   count   interval   count
 1    (0,2]        0     [0,2)        0
 2    (2,4]        2     [2,4)        2
 3    (4,6]        2     [4,6)        1
 4    (6,8]        1     [6,8)        2
 5    (8,10]       6     [8,10)       2
 6    (10,12]      9     [10,12)      7
 7    (12,14]     24     [12,14)     17
 8    (14,16]     27     [14,16)     26
 9    (16,18]     39     [16,18)     31
10    (18,20]     48     [18,20)     46
11    (20,22]     52     [20,22)     43
12    (22,24]     38     [22,24)     57
13    (24,26]     44     [24,26)     36
14    (26,28]     46     [26,28)     52
15    (28,30]     39     [28,30)     39
16    (30,32]     31     [30,32)     33
17    (32,34]     30     [32,34)     26
18    (34,36]     24     [34,36)     29
19    (36,38]     18     [36,38)     27
20    (38,40]      9     [38,40)     12
21    (40,42]      5     [40,42)      6
22    (42,44]      4     [42,44)      0
23    (44,46]      1     [44,46)      5
24    (46,48]      1     [46,48)      0
25    (48,50]      0     [48,50)      1
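A side-by-side comparison like the one above can be reproduced with cut(), which supports both interval conventions through its right argument. The following is only a sketch: the variable names (Mos, brk, right_closed, left_closed) are my own, and include.lowest = TRUE is added to mirror the default of hist(), so the label of the first or last interval may differ slightly from the table.

```r
# Same simulated data as in the question
set.seed(25)
Mos <- round(rnorm(500, mean = 25, sd = 8))
brk <- seq(0, 50, by = 2)

# Right-closed intervals (a, b], the convention hist() uses by default
right_closed <- table(cut(Mos, breaks = brk, right = TRUE, include.lowest = TRUE))

# Left-closed intervals [a, b), the convention shown in the ggplot2 column
left_closed <- table(cut(Mos, breaks = brk, right = FALSE, include.lowest = TRUE))

# Put both sets of counts next to each other
comparison <- data.frame(
  hist_interval    = names(right_closed),
  hist_count       = as.vector(right_closed),
  ggplot2_interval = names(left_closed),
  ggplot2_count    = as.vector(left_closed)
)
print(comparison)
```

Because the data are rounded to integers, many observations sit exactly on a break (every even value), which is why moving the closed side of the interval shifts so many counts between neighbouring bins.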
I included the argument right = FALSE so the histogram intervals are left-closed (right-open), as they are in ggplot2. I added labels to both plots so it is easier to check that the intervals are the same.
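# ggplot2: boundary = 0 puts bin edges on even numbers; closed = "left" gives [a, b) bins
# (after_stat(count) replaces the older ..count.. notation in recent ggplot2 versions)
ggplot(data, aes(x = Mos)) +
  geom_histogram(binwidth = 2, boundary = 0, closed = "left", colour = "black", fill = "white") +
  scale_x_continuous(breaks = seq(0, 50, by = 2)) +
  stat_bin(binwidth = 2, boundary = 0, closed = "left", aes(label = after_stat(count)), vjust = -0.5, geom = "text")
# Base R: same breaks, left-closed intervals, counts printed above the bars
hist(data$Mos, breaks = seq(0, 50, by = 2), labels = TRUE, right = FALSE)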
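Instead of comparing the bar labels by eye, the bin counts can be compared numerically. The sketch below assumes a ggplot2 version in which stat_bin accepts the closed argument and layer_data() is available; the names brk, p, gg_counts and base_counts are illustrative only. It passes the bin edges directly through breaks rather than through binwidth.

```r
library(ggplot2)

set.seed(25)
data <- data.frame(Mos = round(rnorm(500, mean = 25, sd = 8)))
brk <- seq(0, 50, by = 2)

# Pass the bin edges directly and make the bins left-closed, [a, b)
p <- ggplot(data, aes(x = Mos)) +
  geom_histogram(breaks = brk, closed = "left")

# Counts computed by ggplot2 for the histogram layer
gg_counts <- layer_data(p)$count

# Counts from base R with the same left-closed convention
base_counts <- hist(data$Mos, breaks = brk, right = FALSE, plot = FALSE)$counts

# TRUE when both functions place every observation in the same bin
same_bins <- isTRUE(all.equal(as.numeric(gg_counts), as.numeric(base_counts)))
print(same_bins)
```

If same_bins is FALSE, the usual suspects are a different closed side (right in hist, closed in ggplot2) or bin edges that do not line up.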
To check the frequencies in each bin:
# Assign each value to a left-closed bin, then tabulate the counts per bin
freq <- cut(data$Mos, breaks = seq(0, 50, by = 2), dig.lab = 4, right = FALSE)
as.data.frame(table(freq))