Question
I'm trying to calculate a confidence interval for the mean using the bootstrap method in Python. Say I have a vector a with 100 entries, and my aim is to calculate the mean of these 100 values and its 95% confidence interval using the bootstrap. So far I have managed to resample 1000 times from my vector using the np.random.choice function. Then, for each bootstrap vector with 100 entries, I calculated the mean. So now I have 1000 bootstrap means and a single sample mean from my initial vector, but I'm not sure how to proceed from here. How can I use these means to find the confidence interval for the mean of my initial vector? I'm relatively new to Python and this is the first time I've come across the bootstrap method, so any help would be much appreciated.
Recommended answer
You could sort the array of 1000 means and use the 50th and 950th elements as the 90% bootstrap confidence interval. For the 95% interval you asked about, use the 25th and 975th elements instead, which correspond to the 2.5th and 97.5th percentiles.
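As a minimal sketch of this percentile approach, the snippet below resamples a vector 1000 times and reads the interval off the bootstrap means with np.percentile. The data in a are randomly generated placeholders standing in for your 100 values, and the names rng, n_boot, boot_means, ci_low and ci_high are chosen here for illustration only.

```python
import numpy as np

# Seed chosen only so the sketch is reproducible (an assumption, not a requirement)
rng = np.random.default_rng(0)

# Placeholder data standing in for the 100-entry vector from the question
a = rng.normal(loc=10.0, scale=2.0, size=100)

n_boot = 1000

# Resample with replacement; each row is one bootstrap sample of 100 entries
boot_samples = rng.choice(a, size=(n_boot, a.size), replace=True)

# One mean per bootstrap sample, giving 1000 bootstrap means
boot_means = boot_samples.mean(axis=1)

sample_mean = a.mean()

# 95% percentile interval: the 2.5th and 97.5th percentiles of the bootstrap means
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"sample mean: {sample_mean:.3f}")
print(f"95% bootstrap CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

np.percentile sorts and interpolates internally, so it gives essentially the same result as sorting the array and picking the 25th and 975th elements by hand.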
Your set of 1000 means is essentially a sample from the distribution of the mean estimator (the sampling distribution of the mean). So any operation you could perform on a sample from a distribution, you can perform here.
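One example of such an operation, sketched under the same assumptions (placeholder data, illustrative variable names), is to take the standard deviation of the bootstrap means as an estimate of the standard error of the mean and compare it with the usual formula s / sqrt(n).

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is arbitrary, used only for reproducibility

# Placeholder data standing in for the original 100-entry vector
a = rng.normal(loc=10.0, scale=2.0, size=100)

# 1000 bootstrap means, each computed from a resample of the same size as a
boot_means = rng.choice(a, size=(1000, a.size), replace=True).mean(axis=1)

# Treat the bootstrap means as a sample: their spread estimates the standard error
boot_se = boot_means.std(ddof=1)

# Textbook standard error of the mean, for comparison
analytic_se = a.std(ddof=1) / np.sqrt(a.size)

print(f"bootstrap SE: {boot_se:.4f}")
print(f"analytic SE:  {analytic_se:.4f}")
```

The two numbers should be close but not identical, since the bootstrap estimate carries some Monte Carlo noise from using a finite number of resamples.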