Optimizers are overrated

I started learning ASM not long ago to improve my understanding of the hardware architecture and my ability to optimize C code.
The results of my first experiment were surprising, to say the least. After reading the chapter on loops in my ASM book I wanted to test whether modern C compilers are actually as smart as commonly claimed. I chose the simplest loop I could think of: calling putchar 100 times. The first function (foo) uses a typical C-style loop to test the assumption that "the compiler will optimize that better than any human could". The second function (bar) is based on my newly gained knowledge: the loop is basically ASM written in C. Now I am certain that if I asked here which one is more efficient, all you guys would reply "the compiler will most likely generate the same code in both cases" (I have read such claims countless times here). Well, look at the ASM output below to see how wrong your assumption is.

    /* C Code */

    void foo(void)
    {
        int i;
        for (i = 0; i < 100; i++)
            putchar('a');
    }

    void bar(void)
    {
        int i = 100;
        do {
            putchar('a');
        } while (--i);
    }

As I said, damn simple. No nasty side effects, no access to global variables, etc. The optimizer has no excuses. It should generate optimal code in both cases. But see the result:

    /* GCC 4.3.0 on x86/Windows, -O2 */

    /* foo */
    L7:
        subl $12, %esp
        pushl $97
        call _putchar
        incl %ebx
        addl $16, %esp
        cmpl $100, %ebx
        jne L7

    /* bar */
    L2:
        subl $12, %esp
        pushl $97
        call _putchar
        addl $16, %esp
        decl %ebx
        jne L2

Comment: See, even the most recent version of what is probably the most widely used compiler cannot correctly optimize the simplest loop! The foo loop carries an extra cmpl that the bar loop does not need. At least GCC understood the bar loop, so my "write C like ASM" optimization worked.

At this point you might wonder what horrible things an average C compiler will do when GCC already fails so badly.
Here is the gruesome result:

    /* lccwin32, optimize on */

    /* foo */
    _$4:
        pushl $97
        call _putchar
        popl %ecx
        incl %edi
        cmpl $100,%edi
        jl _$4

    /* bar */
    _$10:
        pushl $97
        call _putchar
        popl %ecx
        movl %edi,%eax
        decl %eax
        movl %eax,%edi
        or %eax,%eax
        jne _$10

Comment: lcc is unable to optimize the foo loop, just like GCC, but it adds insult to injury by actually generating worse code for the ASM-style loop! So you cannot even optimize the loop yourself!

    /* MS Visual C++ 6 /O2 */

For this compiler I had to replace the putchar call with a call to a custom my_putchar function, otherwise the compiler replaces the putchar calls with direct OS API stuff. While this is a good optimization, it is not the subject of this test and only makes the resulting asm harder to read, so I suppressed that.

    /* foo */
        jmp SHORT $L833
    $L834:
        mov eax, DWORD PTR _i$[ebp]
        add eax, 1
        mov DWORD PTR _i$[ebp], eax
    $L833:
        cmp DWORD PTR _i$[ebp], 100
        jge SHORT $L835
        push 97
        call _my_putchar
        add esp, 4
        jmp SHORT $L834
    $L835:

    /* bar */
    $L840:
        push 97
        call _my_putchar
        add esp, 4
        mov eax, DWORD PTR _i$[ebp]
        sub eax, 1
        mov DWORD PTR _i$[ebp], eax
        cmp DWORD PTR _i$[ebp], 0
        jne SHORT $L840

Comment: Amazingly enough, this compiler has found yet another way to screw up. Would you have thought that each compiler generates different code for such a simple construct?

I hope you agree that the compiler of the beast deserves the award "Worst of Show" for this mess. Are MS compilers still this bad?
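
For anyone who wants to repeat the comparison, here is a minimal self-contained sketch of the two loops. It is an illustration under stated assumptions, not the exact test file from the post: the body of my_putchar was never shown, so the version below is a hypothetical stand-in, and call_count, count_calls and check_loops are helpers added here only to confirm that both loops make the same 100 calls.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative addition: counts how often my_putchar is called. */
int call_count = 0;

/* Hypothetical stand-in for the custom my_putchar used in the MSVC run;
   the original body was not shown in the post. */
int my_putchar(int c)
{
    ++call_count;
    return putchar(c);
}

/* Typical C-style counting loop, as in the post. */
void foo(void)
{
    int i;
    for (i = 0; i < 100; i++)
        my_putchar('a');
}

/* "ASM written in C": count down to zero and test the decremented value. */
void bar(void)
{
    int i = 100;
    do {
        my_putchar('a');
    } while (--i);
}

/* Illustrative helper: run one loop function and return its call count. */
int count_calls(void (*loop)(void))
{
    call_count = 0;
    loop();
    return call_count;
}

/* Illustrative helper: both loops should make exactly 100 calls. */
void check_loops(void)
{
    assert(count_calls(foo) == 100);
    assert(count_calls(bar) == 100);
}
```

Calling check_loops() from a main of your own should pass silently, which shows the two functions are interchangeable in behavior. To look at the generated assembly with GCC, compiling with -O2 -S emits it to a .s file. Expect the listing to differ from the ones above depending on compiler version and target, and note that a compiler may inline my_putchar when it is defined in the same file, so keeping it in a separate source file is probably closer to the setup described in the post.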
10-17 00:42