c - Quake inverse square root: accuracy

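The "magic" method of computing the inverse square root, which apparently dates back to the Quake game, is described in many sources. Wikipedia has a nice article on it: https://en.wikipedia.org/wiki/Fast_inverse_square_root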

In particular, I found the following to be a very good write-up and analysis of the algorithm: https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf

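I am trying to reproduce some of the results in that paper, but I have run into a problem with accuracy. The algorithm, coded in C, is as follows: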

#include <math.h>
#include <stdio.h>

float Q_rsqrt(float number) {
  long i;
  float x2, y;
  const float threehalfs = 1.5F;

  x2 = number * 0.5F;
  y = number;
  i = *(long *) &y;
  i = 0x5f3759df - (i >> 1);
  y = *(float *) &i;
  y = y * (threehalfs - (x2 * y * y));
  // y = y * (threehalfs - (x2 * y * y));
  return y;
}

The paper states that the relative error is at most 0.0017522874 for all positive normal floats. (See Appendix 2 for the code, and the discussion in Section 1.4.)

However, when I plug in the number 1.4569335e-2F, the error I get is larger than this predicted tolerance:

int main ()
{

  float f = 1.4569335e-2F;

  double tolerance = 0.0017522874;
  double actual    = 1.0 / sqrt(f);
  float  magic     = Q_rsqrt(f);
  double err       = fabs (sqrt(f) * (double) magic - 1);

  printf("Input    : %a\n", f);
  printf("Actual   : %a\n", actual);
  printf("Magic    : %a\n", magic);
  printf("Err      : %a\n", err);
  printf("Tolerance: %a\n", tolerance);
  printf("Passes   : %d\n", err <= tolerance);

  return 0;
}

The output is:

Input    : 0x1.dd687p-7
Actual   : 0x1.091cc953ea828p+3
Magic    : 0x1.08a5dcp+3
Err      : 0x1.cb5b716b7b6p-10
Tolerance: 0x1.cb5a044e0581p-10
Passes   : 0

So this particular input appears to violate the claim made in the paper.

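I would like to know whether this is a problem with the paper itself or whether I have made a mistake in my code. I would appreciate any feedback!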

Best answer

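Let's use a small piece of code to recompute the bound on the relative error and show that it is slightly larger than the one given in the thesis of Matthew Robertson. Indeed, as first noticed in the answer by @squeamishossifrage and as noted in the thesis of Matthew Robertson, this implementation is the one disclosed in the source code of Quake III. In particular, the original value of the Quake III constant can be found in the Quake III source, in the file q_math.c at line 561.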

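First, the code needs to be adapted to work on 64-bit platforms. The only thing that has to change is the integer type: long is not platform-independent. On my Linux machine, sizeof(long) returns 8. As updated on page 49 of the thesis, the type uint32_t ensures that the integer has the same size as a float.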
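The size requirement can also be checked at compile time, and the bits can be copied with memcpy rather than through a pointer cast. The following is only a sketch of that variant, not the code from the thesis or from Quake III: the helper names float_to_bits, bits_to_float and q_rsqrt_memcpy are made up for illustration, and a C11 compiler is assumed for static_assert.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The bit trick only makes sense if float and uint32_t have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: copy the bytes of a float into a 32-bit integer. */
uint32_t float_to_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: copy the bytes of a 32-bit integer into a float. */
float bits_to_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Same arithmetic as Q_rsqrt, with the pointer casts replaced by memcpy. */
float q_rsqrt_memcpy(float number) {
    const float x2 = number * 0.5F;
    float y = bits_to_float(0x5f3759df - (float_to_bits(number) >> 1));
    y = y * (1.5F - (x2 * y * y));   /* single refinement step, as in Q_rsqrt */
    return y;
}
```

For example, q_rsqrt_memcpy(4.0F) should return a value close to 0.5. If long were used in place of uint32_t on a platform where sizeof(long) is 8, the static_assert-style size check would be the place where the mismatch shows up.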

Here is the code, to be compiled with gcc main.c -o main -lm -Wall and run with ./main:

#include <math.h>
#include <stdio.h>
#include <inttypes.h>

float Q_rsqrt(float number) {
    uint32_t i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y = number;
    i = *(uint32_t *) &y;
    i = 0x5f3759df - (i >> 1); //  0x5f3759df 0x5f375a86
    y = *(float *) &i;
    y = y * (threehalfs - (x2 * y * y));
    // y = y * (threehalfs - (x2 * y * y));
    return y;
}

int main ()
{

    printf("%zu %zu\n", sizeof(long), sizeof(uint32_t)); /* sizeof yields size_t, so use %zu */

    uint32_t i;
    float y;
    double e, max = 0.0;
    float maxval=0;
    for(i = 0x0000000; i < 0x6f800000; i++) {
        y = *(float *) &i;
        if(y>1e-30){
            e = fabs(sqrt((double)y)*(double)Q_rsqrt(y) - 1);
            if(e > max){
                max = e;
                maxval=y;
            }
        }
    }
    printf("On value %2.8g == %a\n", maxval, maxval);
    printf("The bound is %2.12g == %a\n", max, max);

    return 0;
}

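For the bound, I obtained 0.0017523386721 == 0x1.cb5d752717ep-10. As you noticed, it is slightly larger than the one reported in the thesis (0.001752287). Evaluating the error using float instead of double does not change the result much.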