Question
Today I had a discussion with a friend of mine, and we debated "compiler optimization" for a couple of hours.
I defended the point that a compiler optimization might sometimes introduce bugs, or at least undesired behavior.
My friend totally disagreed, saying that "compilers are built by smart people and do smart things" and thus can never go wrong.
He didn't convince me at all, but I have to admit I lack real-life examples to strengthen my point.
Who is right here? If I am, do you have any real-life example where a compiler optimization produced a bug in the resulting software? If I'm mistaken, should I stop programming and take up fishing instead?
Recommended answer
Compiler optimizations can introduce bugs or undesirable behaviour. That's why you can turn them off.
One example: a compiler can optimize the read/write access to a memory location, doing things like eliminating duplicate reads or duplicate writes, or re-ordering certain operations. If the memory location in question is only used by a single thread and is actually memory, that may be OK. But if the memory location is a hardware device I/O register, then re-ordering or eliminating writes may be completely wrong. In this situation you normally have to write code knowing that the compiler might "optimize" it, and thus knowing that the naive approach doesn't work.
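A minimal sketch of the usual C answer to the I/O register case, the `volatile` qualifier. The names `fake_device_reg`, `send_command` and `poll_twice` are hypothetical, and the "register" is an ordinary variable so that the sketch can run anywhere; on real hardware the address would come from the device documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped device register. */
static volatile uint32_t fake_device_reg;

/* Writes a two-step command sequence. Because the target is volatile,
   the compiler must emit both stores, in this order. Without volatile,
   the first store could be removed as a dead write. */
void send_command(volatile uint32_t *reg) {
    assert(reg != NULL);
    *reg = 0x1u;
    *reg = 0x2u;
}

/* Reads the register twice. Because the source is volatile, the compiler
   must perform two separate loads instead of reusing the first value. */
uint32_t poll_twice(volatile uint32_t *reg) {
    assert(reg != NULL);
    uint32_t first = *reg;
    uint32_t second = *reg;
    return first ^ second; /* non-zero if the value changed between reads */
}
```

Note that `volatile` only constrains the compiler. It does not by itself order accesses as seen by other CPUs or by the bus, which may need platform-specific barriers.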
Update: As Adam Robinson pointed out in a comment, the scenario I describe above is more of a programming error than an optimizer error. But the point I was trying to illustrate is that a program that is otherwise correct, combined with an optimization that otherwise works properly, can end up with bugs once the two are put together. In some cases the language specification says "You must do things this way because these kinds of optimizations may occur and your program will fail", in which case it's a bug in the code. But sometimes a compiler has a (usually optional) optimization feature that can generate incorrect code because the compiler is trying too hard to optimize the code or can't detect that the optimization is inappropriate. In this case the programmer must know when it is safe to turn on the optimization in question.
Another example: The Linux kernel had a bug where a potentially NULL pointer was being dereferenced before a test for that pointer being null. However, in some cases it was possible to map memory to address zero, thus allowing the dereference to succeed. The compiler, upon noticing that the pointer was dereferenced, assumed that it couldn't be NULL, then removed the later NULL test and all the code in that branch. This introduced a security vulnerability into the code, as the function would proceed to use an invalid pointer containing attacker-supplied data. For cases where the pointer was legitimately null and the memory wasn't mapped to address zero, the kernel would still OOPS as before. So prior to optimization the code contained one bug; afterwards it contained two, and one of them allowed a local root exploit.
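The pattern can be sketched as follows. This is not the kernel code; `struct device` and both functions are hypothetical and only illustrate the ordering problem. In the first function the dereference comes before the test, so the compiler may conclude that the pointer is non-null and delete the check. In the second, the test comes first and must be kept.

```c
#include <assert.h>
#include <stddef.h>

struct device { int flags; }; /* hypothetical type, for illustration only */

/* Problematic ordering: dev is dereferenced before it is tested.
   Dereferencing NULL is undefined behaviour, so the compiler is allowed to
   assume dev != NULL from here on and to remove the check below. */
int get_flags_unchecked(struct device *dev) {
    int flags = dev->flags;
    if (dev == NULL) {
        return -1; /* may be optimized away */
    }
    return flags;
}

/* Safer ordering: test first, dereference only afterwards. */
int get_flags_checked(struct device *dev) {
    if (dev == NULL) {
        return -1;
    }
    return dev->flags;
}

/* Usage: the checked version handles both cases. */
void get_flags_example(void) {
    struct device d = { 42 };
    assert(get_flags_checked(&d) == 42);
    assert(get_flags_checked(NULL) == -1);
}
```

GCC also has a `-fno-delete-null-pointer-checks` option that disables this particular inference, for code that cannot rule out a valid mapping at address zero.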
The securecoding.cert.org website has a document called "Dangerous Optimizations and the Loss of Causality" by Robert C. Seacord, which lists a lot of optimizations that introduce (or expose) bugs in programs. It discusses the various kinds of optimizations that are possible, from "doing what the hardware does" to "trap all possible undefined behaviour" to "do anything that's not disallowed".
Checking for overflow:
// fails because the overflow test gets removed
if (ptr + len < ptr || ptr + len > max) return EINVAL;
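One way to express the same check without relying on pointer wrap-around is to compare `len` against the space that is actually left. This is a sketch under the assumption that `ptr` and `max` point into (or one past the end of) the same array and that `ptr <= max`; the function name `check_range` is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Returns 0 if the len bytes starting at ptr end at or before max,
   and EINVAL otherwise. No pointer arithmetic can overflow here,
   because the subtraction stays inside the array. */
int check_range(const char *ptr, size_t len, const char *max) {
    assert(ptr <= max); /* precondition stated above */
    size_t available = (size_t)(max - ptr);
    if (len > available) {
        return EINVAL;
    }
    return 0;
}
```

Because nothing in this version depends on undefined behaviour, the optimizer has no grounds to remove the test.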
Using overflow arithmetic at all:
// The compiler optimizes this to an infinite loop
for (i = 1; i > 0; i += i) ++j;
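If the loop is meant to stop once doubling would exceed the range of `int`, one option is to do the arithmetic in `unsigned`, where wrap-around is defined by the standard, and compare against `INT_MAX` explicitly. A sketch, with the hypothetical name `count_doublings`:

```c
#include <assert.h>
#include <limits.h>

/* Counts how many powers of two, starting at 1, fit in a non-negative int.
   Unsigned arithmetic wraps modulo UINT_MAX + 1 by definition, so the
   compiler cannot assume that the loop condition is always true. */
int count_doublings(void) {
    int j = 0;
    /* u != 0u guards against wrap-around on unusual platforms
       where UINT_MAX equals INT_MAX. */
    for (unsigned u = 1u; u != 0u && u <= (unsigned)INT_MAX; u += u) {
        ++j;
    }
    return j;
}

/* Usage: one iteration per value bit of int, assuming int has no padding bits. */
void count_doublings_example(void) {
    assert(count_doublings() == (int)(sizeof(int) * CHAR_BIT) - 1);
}
```

GCC and Clang also accept `-fwrapv`, which makes signed overflow wrap and so restores the behaviour the original loop relied on, at the cost of non-portable code.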
Clearing memory of sensitive information:
// the compiler can remove these "useless writes"
memset(password_buffer, 0, sizeof(password_buffer));
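A common work-around is to perform the writes through a pointer to `volatile`, which compilers in practice do not eliminate. The helper name `secure_zero` is hypothetical, and this is a sketch rather than a guarantee: where the platform provides a dedicated function (for example `memset_s` from the optional C11 Annex K, or `explicit_bzero` on glibc and the BSDs), that is usually the better choice.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrites len bytes at buf with zeros through a volatile pointer, so the
   stores are treated as observable side effects and are not removed as
   "useless writes" even if buf is never read again. */
void secure_zero(void *buf, size_t len) {
    assert(buf != NULL || len == 0);
    volatile unsigned char *p = buf;
    while (len--) {
        *p++ = 0;
    }
}
```

Usage would look like `secure_zero(password_buffer, sizeof(password_buffer));` in place of the `memset` call above.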
The problem here is that compilers were, for decades, less aggressive in optimization, and so generations of C programmers learned and understood things like fixed-size two's complement addition and how it overflows. Then the C language standard is amended by compiler developers, and the subtle rules change, despite the hardware not changing. The C language spec is a contract between the developers and compilers, but the terms of the agreement are subject to change over time, and not everyone understands every detail, or agrees that the details are even sensible.
This is why most compilers offer flags to turn off (or turn on) optimizations. Is your program written with the understanding that integers might overflow? Then you should turn off overflow optimizations, because they can introduce bugs. Does your program strictly avoid aliasing pointers? Then you can turn on the optimizations that assume pointers are never aliased. Does your program try to clear memory to avoid leaking information? Oh, in that case you're out of luck: you either need to turn off dead-code removal, or you need to know ahead of time that your compiler is going to eliminate your "dead" code, and use some work-around for it.
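As an illustration of the aliasing case: code that reads an object through a pointer of an unrelated type breaks the aliasing assumption, while copying the bytes with `memcpy` does not. The sketch below uses a hypothetical helper `float_bits` and assumes `float` is a 32-bit IEEE 754 value. With GCC or Clang, code that cannot be rewritten this way is typically built with `-fno-strict-aliasing`, just as code that relies on signed wrap-around is built with `-fwrapv`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of f.
   The tempting version, *(uint32_t *)&f, reads a float object through a
   uint32_t lvalue, which the aliasing rules forbid and which an optimizer
   that assumes "pointers of different types never alias" may miscompile.
   Copying the bytes is well defined, and compilers usually turn the
   memcpy into a single move. */
uint32_t float_bits(float f) {
    uint32_t bits;
    assert(sizeof bits == sizeof f); /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```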