Problem description
I am rewriting a Monte Carlo simulation from Objective C into C, to use in a DLL called from VBA/Excel. The "engine" of the calculation is the generation of a random number in the range 0 to 10000, which is compared to a variable in the 5000-7000 neighbourhood. This is done 400 to 800 times per iteration, and I use 100,000 iterations, so that is about 50,000,000 random numbers generated per run.
While the tests showed no bias in Objective C, I have huge problems with the C code. Objective C is a superset of C, so 95% of the code was copy and paste and hard to get wrong. I have gone through the rest many times, all day yesterday and today, and I have found no problems.
That leaves me with the difference between arc4random_uniform() and rand() seeded with srand(), especially because I see a bias towards the lower numbers in the range 0 to 10000. The tests I have conducted are consistent with a bias of 0.5 to 2% towards numbers below roughly 5000. The only other explanation would be that my code avoids repeats, which I do not think it does.
The code is really simple ("spiller1evne" and "spiller2evne" are numbers between 5500 and 6500):
srand((unsigned)time(NULL));
for (j = 0; j < antala; ++j) {
    [..]
    for (i = 1; i < 450; i++) {
        chance = (rand() % 10001);
        [..]
        if (grey == 1) {
            if (chance < spiller1evnea) vinder = 1;
            else vinder = 2;
        }
        else {
            if (chance < spiller2evnea) vinder = 2;
            else vinder = 1;
        }
Now, I don't need true randomness; pseudorandomness is quite fine. I only need the numbers to be approximately evenly distributed on a cumulative basis. For example, it doesn't matter much if 5555 is twice as likely to come out as 5556. It does matter if 5500-5599 is 5% more likely than 5600-5699, or if there is a clear 0.5-2% bias towards 0-4000 over 6000-9999.
First, does it sound plausible that rand() is my problem, and is there an easy implementation that meets my modest needs?
EDIT: If my suspicion is plausible, could I use something like this?
Would I be able to just copy and paste this in as a replacement (I am writing in C with Visual Studio and am a real novice)?
#include <stdlib.h>

#define RS_SCALE (1.0 / (1.0 + RAND_MAX))

double drand (void) {
    double d;
    do {
        d = (((rand () * RS_SCALE) + rand ()) * RS_SCALE + rand ()) * RS_SCALE;
    } while (d >= 1); /* Round off */
    return d;
}

#define irand(x) ((unsigned int) ((x) * drand ()))
EDIT 2: The code above clearly works without the same bias, so I would recommend it to anyone who has the same "middle-of-the-road" need that I described above. It does come with a penalty, as it calls rand() three times, so I am still looking for a faster solution.
Recommended answer
The rand() function generates an int in the range [0, RAND_MAX]. If you convert this to a different range via the modulus operator (%), as your original code does, then that introduces non-uniformity unless the size of your target range happens to evenly divide RAND_MAX + 1. That sounds like exactly what you see.
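To make the size of that non-uniformity concrete, here is a minimal sketch that counts how many raw rand() outputs map below a threshold after the modulus. It assumes RAND_MAX is 32767, as it is in Visual Studio's C runtime; the function name count_below and its parameters are made up for illustration and are not part of the original code.

```c
/* Counts the raw values r in [0, rand_max] for which r % modulus < threshold.
 * Dividing the result by (rand_max + 1) gives the exact probability that
 * rand() % modulus falls below threshold, if rand() itself is uniform.
 * The name and parameters are hypothetical, chosen for this illustration. */
long count_below(long rand_max, long modulus, long threshold) {
    long count = 0;
    for (long r = 0; r <= rand_max; ++r) {
        if (r % modulus < threshold) {
            ++count;
        }
    }
    return count;
}
```

With rand_max = 32767, modulus = 10001 and threshold = 5000 this returns 17765, so about 54.2% of the 32768 raw values land below 5000, against 5000/10001 (about 50.0%) for a truly uniform draw. The reason is that 32768 = 3 * 10001 + 2765: the results 0 to 2764 are each produced by four raw values, and the results 2765 to 10000 by only three.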
You have multiple options, but if you want to stick with something based on rand() then I suggest this variation on your original approach:
/*
 * Returns a pseudo-random int selected from the uniform distribution
 * over the half-open interval [0, limit), provided that limit does not
 * exceed RAND_MAX.
 */
int range_rand(int limit) {
    int rand_bound = (RAND_MAX / limit) * limit;
    int r;

    while ((r = rand()) >= rand_bound) { /* empty */ }
    return r % limit;
}
Although in principle the number of rand() calls that each call to that function makes is unbounded, in practice the average number of calls is only slightly greater than 1 for relatively small limit values, and the average is no more than 2 for every limit value that does not exceed RAND_MAX. It removes the non-uniformity I described earlier by choosing the initial random number from a subset of [0, RAND_MAX] whose size is a multiple of limit.
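As a rough check on the cost, the average number of rand() calls can be computed directly from the rejection bound. The sketch below assumes rand() is uniform on [0, rand_max] and that 0 < limit <= rand_max; the helper name expected_rand_calls is hypothetical, and rand_max is passed as a parameter only so that different RAND_MAX values can be tried.

```c
/* Expected number of rand() calls per range_rand(limit) call.
 * Assumes raw outputs are uniform on [0, rand_max] and 0 < limit <= rand_max.
 * The helper name and the rand_max parameter are hypothetical. */
double expected_rand_calls(long long rand_max, long long limit) {
    /* Same acceptance bound that range_rand computes. */
    long long rand_bound = (rand_max / limit) * limit;

    /* Each draw is accepted with probability rand_bound / (rand_max + 1).
     * The number of draws until the first acceptance is geometric, so its
     * mean is the reciprocal of that probability. */
    return (double)(rand_max + 1) / (double)rand_bound;
}
```

For limit = 10001 and a RAND_MAX of 32767 (the Visual Studio value), the bound is 30003, so a call to range_rand(10001) is expected to use about 32768 / 30003, or roughly 1.09, calls to rand(). In the simulation loop, the replacement would then be chance = range_rand(10001); in place of chance = (rand() % 10001);.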