Problem description
I experimented with calculating the mean of a list using Parallel.For(). I decided against it, as it is about four times slower than a simple serial version. Yet I am intrigued by the fact that it does not yield exactly the same result as the serial one, and I thought it would be instructive to learn why.
My code is:
public static double Mean(this IList<double> list)
{
    double sum = 0.0;
    Parallel.For(0, list.Count, i => {
        double initialSum;
        double incrementedSum;
        SpinWait spinWait = new SpinWait();
        // Try incrementing the sum until the loop finds the initial sum unchanged,
        // so that it can safely replace it with the incremented one.
        while (true) {
            initialSum = sum;
            incrementedSum = initialSum + list[i];
            if (initialSum == Interlocked.CompareExchange(ref sum, incrementedSum, initialSum)) break;
            spinWait.SpinOnce();
        }
    });
    return sum / list.Count;
}
When I run the code on a random sequence of 2000000 points, the result differs from the serial mean in the last 2 digits.
I searched Stack Overflow and found this: VB.NET running sum in nested loop inside Parallel.for Synclock loses information. My case, however, is different from the one described there. There, a thread-local variable temp is the cause of the inaccuracy, whereas I use a single sum that is updated (I hope) according to the textbook Interlocked.CompareExchange() pattern. The question is of course moot because of the poor performance (which surprises me, although I am aware of the overhead), yet I am curious whether there is something to be learnt from this case.
Your thoughts are appreciated.
Using double is the underlying problem. You can reassure yourself that the synchronization is not the cause by using long instead. The results you got are in fact correct, but that never makes a programmer happy.
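To check the synchronization in isolation, the same compare-and-swap retry loop can be run on long values, where addition is exact as long as nothing overflows. The following is only a minimal sketch under that assumption; the class name LongSumCheck and its two methods are hypothetical names introduced for illustration and are not part of the question's code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, introduced only for this illustration.
public static class LongSumCheck
{
    // Serial reference sum.
    public static long SerialSum(long[] values)
    {
        long sum = 0;
        foreach (long v in values) sum += v;
        return sum;
    }

    // Same retry loop as in the question, but accumulating into a long.
    public static long ParallelSum(long[] values)
    {
        long sum = 0;
        Parallel.For(0, values.Length, i =>
        {
            SpinWait spinWait = new SpinWait();
            while (true)
            {
                // Atomic 64-bit read of the shared sum.
                long initialSum = Interlocked.Read(ref sum);
                long incrementedSum = initialSum + values[i];
                // Publish the new sum only if no other thread changed it in the meantime.
                if (initialSum == Interlocked.CompareExchange(ref sum, incrementedSum, initialSum)) break;
                spinWait.SpinOnce();
            }
        });
        return sum;
    }
}
```

For any input whose total fits in a long, LongSumCheck.ParallelSum(values) should equal LongSumCheck.SerialSum(values) no matter how the iterations are scheduled, which suggests the retry loop itself is not losing updates.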
You discovered that floating-point math is commutative but not associative. In other words, a + b == b + a, but a + b + c is not necessarily equal to a + c + b, because every intermediate sum is rounded. Implicit in your code is that the order in which the numbers are added is quite random.
This C++ question talks about it as well.
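The effect can be reproduced with three numbers. The sketch below assumes ordinary IEEE 754 double arithmetic, as used by current .NET runtimes on x64. The class name AssociativityDemo and its methods are hypothetical and serve only as an illustration.

```csharp
using System;

// Hypothetical demo class, introduced only for this illustration.
public static class AssociativityDemo
{
    // Adds a and b first, then c.
    public static double LeftFirst(double a, double b, double c) => (a + b) + c;

    // Adds b and c first, then a.
    public static double RightFirst(double a, double b, double c) => a + (b + c);

    // Swapping the two operands of a single addition does not change the result.
    public static bool IsCommutative(double a, double b) => a + b == b + a;
}
```

With a = 0.1, b = 0.2 and c = 0.3, IsCommutative(a, b) is true, while LeftFirst and RightFirst are expected to differ in the last digit, because 0.1 + 0.2 is rounded before 0.3 is added. A parallel loop regroups two million such additions differently on every run, so a difference in the last couple of digits of the mean is what you would expect.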