差分运算符(类似于导数运算符)和求和运算符(类似于积分运算符)可用于更改算法,因为它们是逆运算。
Sum of (difference of y) = y
Difference of (sum of y) = y
下面是在 c 程序中以这种方式使用它们的示例。
这个 c 程序演示了制作正方形数组的三种方法。
y = x*x
。 (difference in y) = (x0 + x1)*(difference in x)
。 (sum of y) = x(x+1)(2x+1)/6
。 第二种方法始终比第一种方法稍快,即使我没有费心优化它。我想如果我更加努力,我可以让它变得更好。
第三种方法总是慢一倍,但这并不意味着基本思想是愚蠢的。我可以想象,对于
y = x*x
以外的某些功能,这种方法可能会更快。还有一个整数溢出问题。尝试所有这些转换非常有趣,所以现在我想知道我可以使用哪些其他数学运算符对来转换算法?
这是代码:
#include <stdio.h>
#include <time.h>
#define tries 201
#define loops 100000
void printAllIn(unsigned int array[tries]){
unsigned int index;
for (index = 0; index < tries; ++index)
printf("%u\n", array[index]);
}
int main (int argc, const char * argv[]) {
/*
Goal, Calculate an array of squares from 0 20 as fast as possible
*/
long unsigned int obvious[tries];
long unsigned int sum_of_differences[tries];
long unsigned int difference_of_sums[tries];
clock_t time_of_obvious1;
clock_t time_of_obvious0;
clock_t time_of_sum_of_differences1;
clock_t time_of_sum_of_differences0;
clock_t time_of_difference_of_sums1;
clock_t time_of_difference_of_sums0;
long unsigned int j;
long unsigned int index;
long unsigned int sum1;
long unsigned int sum0;
long signed int signed_index;
time_of_obvious0 = clock();
for (j = 0; j < loops; ++j)
for (index = 0; index < tries; ++index)
obvious[index] = index*index;
time_of_obvious1 = clock();
time_of_sum_of_differences0 = clock();
for (j = 0; j < loops; ++j)
for (index = 1, sum_of_differences[0] = 0; index < tries; ++index)
sum_of_differences[index] = sum_of_differences[index-1] + 2 * index - 1;
time_of_sum_of_differences1 = clock();
time_of_difference_of_sums0 = clock();
for (j = 0; j < loops; ++j)
for (signed_index = 0, sum0 = 0; signed_index < tries; ++signed_index) {
sum1 = signed_index*(signed_index+1)*(2*signed_index+1);
difference_of_sums[signed_index] = (sum1 - sum0)/6;
sum0 = sum1;
}
time_of_difference_of_sums1 = clock();
// printAllIn(obvious);
printf(
"The obvious approach y = x*x took, %f seconds\n",
((double)(time_of_obvious1 - time_of_obvious0))/CLOCKS_PER_SEC
);
// printAllIn(sum_of_differences);
printf(
"The sum of differences approach y1 = y0 + 2x - 1 took, %f seconds\n",
((double)(time_of_sum_of_differences1 - time_of_sum_of_differences0))/CLOCKS_PER_SEC
);
// printAllIn(difference_of_sums);
printf(
"The difference of sums approach y = sum1 - sum0, sum = (x - 1)x(2(x - 1) + 1)/6 took, %f seconds\n",
(double)(time_of_difference_of_sums1 - time_of_difference_of_sums0)/CLOCKS_PER_SEC
);
return 0;
}
最佳答案
这里有两类优化:strength reduction 和 peephole 优化。
强度降低是用更便宜的函数替换“昂贵的”数学函数的常用术语 - 例如,用两个 logarithm table lookups 替换乘法,加法,然后逆对数查找以找到最终结果。
窥孔优化是用左移替换乘法之类的常用术语。对于乘以 2 的幂的特定情况,某些 CPU 具有针对这些操作的简单指令,其运行速度比通用整数乘法更快。
您还可以执行单个算法的优化。你可能会写 a * b
,但有 many different ways to perform multiplication ,不同的算法在不同的条件下表现更好或更差。其中许多决定是由芯片设计人员做出的,但任意精度整数库会根据可用原语的优点做出自己的选择。
关于c - 可以使用哪些其他数学运算符来转换算法,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/9106412/