问题描述
以下是我的代码:
library(fitdistrplus)
s <- c(11, 4, 2, 9, 3, 1, 2, 2, 3, 2, 2, 5, 8,3, 15, 3, 9, 22, 0, 4, 10, 1, 9, 10, 11,
2, 8, 2, 6, 0, 15, 0 , 2, 11, 0, 6, 3, 5, 0, 7, 6, 0, 7, 1, 0, 6, 4, 1, 3, 5,
2, 6, 0, 10, 6, 4, 1, 17, 0, 1, 0, 6, 6, 1, 5, 4, 8, 0, 1, 1, 5, 15, 14, 8, 1,
3, 2, 9, 4, 4, 1, 2, 18, 0, 0, 10, 5, 0, 5, 0, 1, 2, 0, 5, 1, 1, 2, 3, 7)
o <- fitdist(s, "gamma", method = "mle")
summary(o)
plot(o)
并且错误说:
and the error says:
fitdist(s, "gamma", method = "mle") 中的错误:函数 mle未能估计参数,错误代码为 100
推荐答案
Gamma 分布不允许零值(对于 0 的响应,似然将评估为零,对数似然将是无限的),除非形状参数正好是 1.0(即指数分布 - 见下文)......这是一个统计/数学问题,而不是一个编程问题.您将不得不为零值找到一些明智的做法.根据对您的应用程序有意义的内容,您可以(例如)
The Gamma distribution doesn't allow zero values (the likelihood will evaluate to zero, and the log-likelihood will be infinite, for a response of 0) unless the shape parameter is exactly 1.0 (i.e., an exponential distribution - see below) ... that's a statistical/mathematical problem, not a programming problem. You're going to have to find something sensible to do about the zero values. Depending on what makes sense for your application, you could (for example)
- 选择不同的发行版进行测试
- 排除零值 (
fitdist(s[s>0], ...)
) - 将零值设置为一些合理的非零值 (
fitdist(replace(s,which(s==0),0.1),...)
其中哪个(如果有)最好取决于您的应用.
@Sandipan Dey 的第一个答案(在数据集中留下零)看似是有道理的,但实际上它卡在形状参数等于 1 处.
@Sandipan Dey's first answer (leaving the zeros in the data set) appears to make sense, but in fact it gets stuck at the shape parameter equal to 1.
o <- fitdist(s, "exp", method = "mle")
给出与@Sandipan 代码相同的答案(除了它估计速率=0.2161572,为 Gamma 分布估计的尺度参数的倒数 = 4.626262 - 这只是参数化的变化).如果您选择拟合指数而不是 Gamma,那很好 - 但您应该有目的地这样做,而不是偶然...
gives the same answer as @Sandipan's code (except that it estimates rate=0.2161572, the inverse of the scale parameter=4.626262 that's estimated for the Gamma distribution - this is just a change in parameterization). If you choose to fit an exponential instead of a Gamma, that's fine - but you should do it on purpose, not by accident ...
为了说明包含零的拟合可能无法按预期工作,我将构建自己的负对数似然函数并显示每种情况的似然面.
To illustrate that the zeros-included fit may not be working as expected, I'll construct my own negative log-likelihood function and display the likelihood surface for each case.
mfun <- function(sh,sc,dd=s) {
-sum(dgamma(dd,shape=sh,scale=sc,log=TRUE))
}
library(emdbook) ## for curve3d() helper function
包含零的曲面:
cc1 <- curve3d(mfun(x,y),
## set up "shape" limits" so we evaluate
## exactly shape=1.000 ...
xlim=c(0.55,3.55),
n=c(41,41),
ylim=c(2,5),
sys3d="none")
png("gammazero1.png")
with(cc1,image(x,y,z))
dev.off()
在这种情况下,表面仅定义为 shape=1(即指数分布);白色区域代表无限对数似然.不是 shape=1 是最佳合身,而是唯一合身...
In this case the surface is only defined at shape=1 (i.e. an exponential distribution); the white regions represent infinite log-likelihoods. It's not that shape=1 is the best fit, it's that it's the only fit ...
排除零的曲面:
cc2 <- curve3d(mfun(x,y,dd=s[s>0]),
## set up "shape" limits" so we evaluate
## exactly shape=1.000 ...
xlim=c(0.55,3.55),
n=c(41,41),
ylim=c(2,5),
sys3d="none")
png("gammazero2.png")
with(cc2,image(x,y,z))
with(cc2,contour(x,y,z,add=TRUE))
abline(v=1.0,lwd=2,lty=2)
dev.off()
这篇关于伽马分布拟合误差的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!