Problem description
I have some time-to-event data for which I need to generate around 200 pairs of shape/scale parameters, one per subgroup, for a simulation model. I have analysed the data, and it is best described by a Weibull distribution.
Normally, I would use the fitdistrplus package and fitdist(x, "weibull") to do so. However, this data has been matched using kernel matching, and I have a variable of weighting values called km, so the fit needs to incorporate weights, which isn't something fitdist can do as far as I can tell.
With my gamma-distributed data, instead of using fitdist, I did the calculation manually using the wtd.mean and wtd.var functions from the Hmisc package, which worked well. However, finding a similar formula for the Weibull is eluding me.
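For reference, the manual gamma calculation described above can be sketched as a weighted method-of-moments fit. This is only an illustration under assumptions: `weighted_gamma_mom` is a hypothetical helper name, the weights are made up to stand in for km, and the weighted variance divides by the sum of the weights, which may differ slightly from what wtd.var returns depending on its normalisation settings.

```r
# Weighted method-of-moments estimates for a gamma distribution (base R only).
# `weighted_gamma_mom` is a hypothetical helper name, not from any package.
weighted_gamma_mom <- function(x, w) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  m <- sum(w * x) / sum(w)            # weighted mean
  v <- sum(w * (x - m)^2) / sum(w)    # weighted variance (no bias correction)
  # Gamma moments: mean = shape / rate, variance = shape / rate^2
  c(shape = m^2 / v, rate = m / v)
}

set.seed(1)
x <- rgamma(500, shape = 2, rate = 0.5)
w <- runif(500, 0.5, 1.5)             # made-up weights standing in for km
weighted_gamma_mom(x, w)
```

The gamma case is easy because both parameters have closed-form expressions in the mean and variance; the Weibull shape has no such closed form, which is why a numerical step is needed there.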
I've been testing a few options and comparing them against the fitdist results:
library(fitdistrplus)
test_data <- rweibull(100, 0.676, 946)
fitweibull <- fitdist(test_data, "weibull", method = "mle", lower = c(0, 0))
fitweibull$estimate
#       shape       scale
#   0.6981165 935.0907482
I first tested the approach from the question "The Weibull distribution in R (ExtDist)":
library(bbmle)
m1 <- mle2(y~dweibull(shape=exp(lshape),scale=exp(lscale)),
data=data.frame(y=test_data),
start=list(lshape=0,lscale=0))
which gave me lshape = -0.3919991 and lscale = 6.852033 (both on the log scale, because of the exp() in the model formula).
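Because the model was parameterised as shape = exp(lshape) and scale = exp(lscale), these estimates have to be exponentiated before they can be compared with the fitdist output. A minimal sketch, using only the two numbers reported above:

```r
# Log-scale estimates copied from the mle2 result reported above
log_est <- c(lshape = -0.3919991, lscale = 6.852033)

# Undo the exp() reparameterisation used in the model formula
est <- exp(log_est)
names(est) <- c("shape", "scale")
est
```

The back-transformed values land close to the 0.676 and 946 used to simulate test_data.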
The other thing I've tried is eweibull from the EnvStats package.
library(EnvStats)
# Store the fit under its own name rather than masking the eweibull function
fit_ew <- eweibull(test_data)
fit_ew$parameters
#      shape      scale
#   0.698091 935.239277
However, while these all give results, I still don't see how to fit my data together with the weights using any of them.
Edit: I have also tried the similarly named eWeibull from the ExtDist package (which I'm not 100% sure still works, but it does have a Weibull function that takes a weight!). I get a lot of error messages about the inputs being non-computable (NA or infinite). If I do it with map, i.e. map(test_data, test_km, eWeibull), I get NULL for all 100 values. If I try it with just test_data, I get a long string of errors associated with optimx.
I have also tried fitDistr from propagate, which gives errors saying that weights should be a specific length. For example, if both the data and the weights have length 100, I get an error that weights should be length 94. If I set it to 94, it tells me it has to be length 132.
I need either to pass a set of pre-weighted summary statistics (mean, variance, sd, etc.) into the calculation, or to have a function that takes both the data and the weights and uses them in the calculation.
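One possible shape for the second option is a weighted maximum-likelihood fit written directly with base R's optim, where each observation's log-density is multiplied by its weight. This is a sketch under assumptions, not a vetted implementation: `weighted_weibull_mle` is a hypothetical helper name, `test_km` is simulated here because the real km values are not shown, and treating the weights as multipliers on the log-likelihood is one common convention rather than something the packages above prescribe.

```r
# Weighted Weibull maximum likelihood using only base R.
# `weighted_weibull_mle` is a hypothetical helper, not a function from any package.
weighted_weibull_mle <- function(x, w) {
  stopifnot(length(x) == length(w), all(x > 0), all(w >= 0), sum(w) > 0)
  # Negative weighted log-likelihood. Parameters are optimised on the log
  # scale so that shape and scale stay positive.
  nll <- function(par) {
    -sum(w * dweibull(x, shape = exp(par[1]), scale = exp(par[2]), log = TRUE))
  }
  # Crude starting values: shape = 1 and scale = weighted mean
  start <- c(0, log(sum(w * x) / sum(w)))
  fit <- optim(start, nll)            # default Nelder-Mead search
  if (fit$convergence != 0) warning("optim did not report convergence")
  c(shape = exp(fit$par[1]), scale = exp(fit$par[2]))
}

set.seed(42)
test_data <- rweibull(100, 0.676, 946)
test_km <- runif(100, 0.2, 1)         # made-up weights standing in for km
weighted_weibull_mle(test_data, test_km)
```

With all weights equal, this reduces to an ordinary Weibull MLE, so it can be sanity-checked against the fitdist result. Rescaling the weights by a constant does not change the point estimates, but any standard errors derived from such a fit would need separate care.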
After much trial and error, I edited the eweibull function from the EnvStats package so that, instead of using mean(x) and sd(x), it uses wtd.mean(x, w) and sqrt(wtd.var(x, w)). It now runs and outputs weighted values.
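The idea of swapping weighted moments in for mean(x) and sd(x) can be illustrated with a stand-alone method-of-moments estimator. This is not the EnvStats source and may not match what the edited eweibull does internally; `weighted_weibull_mom` is a hypothetical helper, and the weighted sd here divides by the sum of the weights rather than following wtd.var exactly. It relies on the fact that the Weibull coefficient of variation depends on the shape alone, so the shape can be found numerically and the scale then follows from the mean.

```r
# Weighted method-of-moments Weibull estimates using only base R.
# `weighted_weibull_mom` is a hypothetical helper; it is not the EnvStats code.
weighted_weibull_mom <- function(x, w) {
  stopifnot(length(x) == length(w), all(x > 0), all(w >= 0), sum(w) > 0)
  m <- sum(w * x) / sum(w)                  # weighted mean
  s <- sqrt(sum(w * (x - m)^2) / sum(w))    # weighted sd (no bias correction)
  cv <- s / m
  # Weibull: CV^2 = gamma(1 + 2/k) / gamma(1 + 1/k)^2 - 1, a function of shape k only.
  # lgamma is used to avoid overflow for small k.
  cv_gap <- function(k) {
    sqrt(exp(lgamma(1 + 2 / k) - 2 * lgamma(1 + 1 / k)) - 1) - cv
  }
  # uniroot fails if the sample CV falls outside the range covered by this interval
  shape <- uniroot(cv_gap, interval = c(0.05, 50), tol = 1e-10)$root
  scale <- m / gamma(1 + 1 / shape)         # from mean = scale * gamma(1 + 1/k)
  c(shape = shape, scale = scale)
}

set.seed(7)
x <- rweibull(100, 0.676, 946)
w <- runif(100, 0.2, 1)                     # made-up weights standing in for km
weighted_weibull_mom(x, w)
```

Method-of-moments estimates can be noisy for heavy-tailed data such as a Weibull with shape below 1, so they are probably better used as starting values or a cross-check than as final parameters.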