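I am using ggplot / easyGgplot2 to create density plots for two groups. I would like to measure or indicate how much the two curves overlap. I am even open to any other solution that does not involve curves, as long as it lets me measure which groups (out of several different data groups) differ more.
Is there a simple way to do this in R?
For example, the sample code below produces a plot of two overlapping density curves.
How can I estimate the percentage of area that the two have in common?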
ggplot2.density(data=weight, xName='weight', groupName='sex',
legendPosition="top",
alpha=0.5, fillGroupDensity=TRUE )
Best answer
I like the previous answer, but this may be a bit more intuitive, and I made sure to use a common bandwidth:
library("caTools")  # provides trapz() for trapezoidal integration
# Extract a common bandwidth from the pooled data
Bw <- density(iris$Petal.Width)$bw
# Get iris data: Petal.Width for the second and third species
Sample <- with(iris, split(Petal.Width, Species))[2:3]
# Estimate kernel densities using the common bandwidth and a shared grid
Densities <- lapply(Sample, density,
                    bw = Bw,
                    n = 512,
                    from = -1,
                    to = 3)
# Plot
plot(Densities[[1]], xlim = c(-1, 3),
     col = "steelblue",
     main = "")
lines(Densities[[2]], col = "orange")
# Overlap: pointwise minimum of the two curves on the shared grid
X <- Densities[[1]]$x
Y1 <- Densities[[1]]$y
Y2 <- Densities[[2]]$y
Overlap <- pmin(Y1, Y2)
# Hatch the shared region; border = NA suppresses the outline
polygon(c(X, X[1]), c(Overlap, Overlap[1]),
        lwd = 2, col = "hotpink", border = NA, density = 20)
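# Integrate
# Each density integrates to about 1, so Total is about 2 and Surface is
# the shared area as a fraction of the combined area under both curves
Total <- trapz(X, Y1) + trapz(X, Y2)
(Surface <- trapz(X, Overlap) / Total)
# Label the plot with the result as a percentage
SText <- paste(sprintf("%.3f", 100 * Surface), "%")
text(X[which.max(Overlap)], 1.2 * max(Overlap), SText)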
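
Since the question also asks which of several groups differ the most, the same idea can be wrapped in a function and applied to every pair of groups. The following is only a sketch using base R: `overlap_coef` is a hypothetical helper name, it reports the shared area itself (between 0 and 1) rather than dividing by the combined area as above, and it assumes the bandwidth is taken from the two samples pooled together, which is a different choice from the whole-data bandwidth used in the answer.

```r
# Sketch: shared area under two kernel density curves, base R only.
# overlap_coef is a hypothetical helper, not part of any package.
overlap_coef <- function(a, b, n = 512) {
  # Assumption: one bandwidth taken from the pooled samples
  bw <- density(c(a, b))$bw
  # Shared grid, extended 3 bandwidths past the data on both sides
  lo <- min(a, b) - 3 * bw
  hi <- max(a, b) + 3 * bw
  da <- density(a, bw = bw, n = n, from = lo, to = hi)
  db <- density(b, bw = bw, n = n, from = lo, to = hi)
  # Pointwise minimum, integrated with the trapezoidal rule
  y <- pmin(da$y, db$y)
  sum(diff(da$x) * (y[-1] + y[-length(y)]) / 2)
}

# Compare every pair of species; a smaller value means less overlap
groups <- split(iris$Petal.Width, iris$Species)
pairs <- combn(names(groups), 2)
overlaps <- apply(pairs, 2, function(p) {
  overlap_coef(groups[[p[1]]], groups[[p[2]]])
})
names(overlaps) <- apply(pairs, 2, paste, collapse = " vs ")
print(round(overlaps, 3))
```

Pairs with a value near 0 barely overlap and pairs with a value near 1 are almost indistinguishable, so sorting `overlaps` gives a rough ranking of which groups differ most.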