差分运算符(类似于导数运算符)和求和运算符(类似于积分运算符)可用于更改算法,因为它们是逆运算。

Sum of (difference of y) = y
Difference of (sum of y) = y

下面是在 c 程序中以这种方式使用它们的示例。

这个 c 程序演示了制作正方形数组的三种方法。
  • 第一种方法是简单明显的方法 y = x*x
  • 第二种方法使用等式 (difference in y) = (x0 + x1)*(difference in x)
  • 第三种方法是相反的,使用等式 (sum of y) = x(x+1)(2x+1)/6

  • 第二种方法始终比第一种方法稍快,即使我没有费心优化它。我想如果我更加努力,我可以让它变得更好。

    第三种方法总是慢一倍,但这并不意味着基本思想是愚蠢的。我可以想象,对于 y = x*x 以外的某些功能,这种方法可能会更快。还有一个整数溢出问题。

    尝试所有这些转换非常有趣,所以现在我想知道我可以使用哪些其他数学运算符对来转换算法?

    这是代码:
    #include <stdio.h>
    #include <time.h>
    
    #define tries 201
    #define loops 100000
    
    void printAllIn(unsigned int array[tries]){
    unsigned int index;
    
    for (index = 0; index < tries; ++index)
        printf("%u\n", array[index]);
    }
    
    int main (int argc, const char * argv[]) {
            /*
    
        Goal, Calculate an array of squares from 0 20 as fast as possible
    
            */
    
        long unsigned int obvious[tries];
        long unsigned int sum_of_differences[tries];
        long unsigned int difference_of_sums[tries];
    
        clock_t time_of_obvious1;
        clock_t time_of_obvious0;
    
        clock_t time_of_sum_of_differences1;
        clock_t time_of_sum_of_differences0;
    
        clock_t time_of_difference_of_sums1;
        clock_t time_of_difference_of_sums0;
    
        long unsigned int j;
        long unsigned int index;
        long unsigned int sum1;
        long unsigned int sum0;
        long signed int signed_index;
    
        time_of_obvious0 = clock();
        for (j = 0; j < loops; ++j)
        for (index = 0; index < tries; ++index)
            obvious[index] = index*index;
        time_of_obvious1 = clock();
    
            time_of_sum_of_differences0 = clock();
        for (j = 0; j < loops; ++j)
        for (index = 1, sum_of_differences[0] = 0; index < tries; ++index)
            sum_of_differences[index] = sum_of_differences[index-1] + 2 * index - 1;
        time_of_sum_of_differences1 = clock();
    
        time_of_difference_of_sums0 = clock();
        for (j = 0; j < loops; ++j)
        for (signed_index = 0, sum0 = 0; signed_index < tries; ++signed_index) {
            sum1 = signed_index*(signed_index+1)*(2*signed_index+1);
            difference_of_sums[signed_index] = (sum1 - sum0)/6;
            sum0 = sum1;
        }
        time_of_difference_of_sums1 = clock();
    
        // printAllIn(obvious);
        printf(
           "The obvious approach y = x*x took, %f seconds\n",
           ((double)(time_of_obvious1 - time_of_obvious0))/CLOCKS_PER_SEC
           );
        // printAllIn(sum_of_differences);
        printf(
           "The sum of differences approach y1 = y0 + 2x - 1 took, %f seconds\n",
           ((double)(time_of_sum_of_differences1 - time_of_sum_of_differences0))/CLOCKS_PER_SEC
           );
        // printAllIn(difference_of_sums);
        printf(
           "The difference of sums approach y = sum1 - sum0, sum = (x - 1)x(2(x - 1) + 1)/6 took, %f seconds\n",
           (double)(time_of_difference_of_sums1 - time_of_difference_of_sums0)/CLOCKS_PER_SEC
           );
    
        return 0;
    }
    

    最佳答案

    这里有两类优化:strength reductionpeephole 优化。

    强度降低是用更便宜的函数替换“昂贵的”数学函数的常用术语 - 例如,用两个 logarithm table lookups 替换乘法,加法,然后逆对数查找以找到最终结果。

    窥孔优化是用左移替换乘法之类的常用术语。对于乘以 2 的幂的特定情况,某些 CPU 具有针对这些操作的简单指令,其运行速度比通用整数乘法更快。

    您还可以执行单个算法的优化。你可能会写 a * b ,但有 many different ways to perform multiplication ,不同的算法在不同的条件下表现更好或更差。其中许多决定是由芯片设计人员做出的,但任意精度整数库会根据可用原语的优点做出自己的选择。

    关于c - 可以使用哪些其他数学运算符来转换算法,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/9106412/

    10-14 08:26