Problem description
I remember reading somewhere that, to really optimize and speed up certain sections of code, programmers write those sections in assembly language. My questions are:
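- Is this practice still followed? And how does one do it?
- Isn't writing in assembly language a bit too cumbersome and archaic?
- When we compile C code (with or without the -O3 flag), the compiler does some code optimization, links all the libraries and converts the code to a binary object file. So when we run the program it is already in its most basic form, i.e. binary. How, then, does bringing in assembly language help?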
I am trying to understand this concept, and any help or links are much appreciated.
UPDATE: Rephrasing point 3, as requested by dbemerlin: you might be able to write more effective assembly code than the compiler generates, but unless you are an assembly expert your code will probably run slower, because the compiler often optimizes code better than most humans can.
Recommended answer
The only time it's useful to revert to assembly language is when
- the CPU instructions don't have functional equivalents in C++ (e.g. single-instruction-multiple-data instructions, BCD or decimal arithmetic operations)
- AND the compiler doesn't provide extra functions to wrap these operations (e.g. the C++11 Standard has atomic operations including compare-and-swap, and <cstdlib> has div/ldiv et al. for getting quotient and remainder efficiently)
- AND there isn't a good third-party library (e.g. http://mitpress.mit.edu/catalog/item/default.asp?tid=3952&ttype=2)
OR
- for some inexplicable reason, the optimiser is failing to use the best CPU instructions
...AND...
- use of those CPU instructions would give a significant and useful performance boost to bottleneck code.
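As a rough illustration of the "compiler already wraps these operations" point, the sketch below uses the two standard facilities the answer names: std::div from <cstdlib> and the C++11 compare-and-swap on std::atomic. The helper names quot_rem and store_if_greater are hypothetical, made up for this example; whether the compiler lowers them to a single divide or a lock-prefixed compare-exchange depends on the target and optimisation level.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>

// Hypothetical helper: quotient and remainder from one library call.
// std::div is the <cstdlib> function the answer refers to.
inline std::div_t quot_rem(int numerator, int denominator) {
    assert(denominator != 0);  // division by zero is undefined behaviour
    return std::div(numerator, denominator);
}

// Hypothetical helper: raise `target` to `candidate` if candidate is larger,
// using the C++11 compare-and-swap primitive instead of hand-written assembly.
// Returns true if this call stored the new value.
inline bool store_if_greater(std::atomic<int>& target, int candidate) {
    int observed = target.load();
    while (candidate > observed) {
        // On failure, compare_exchange_weak refreshes `observed`
        // with the value currently held by `target`.
        if (target.compare_exchange_weak(observed, candidate)) {
            return true;
        }
    }
    return false;
}
```

Calling quot_rem(17, 5) yields a std::div_t whose quot and rem members hold the two results, and store_if_greater can be called from several threads without any inline assembly.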
Simply using inline assembly to do an operation that can easily be expressed in C++, like adding two values or searching in a string, is actively counterproductive, because:
- the compiler knows how to do this equally well
- to verify this, look at its assembly output (e.g. gcc -S) or disassemble the machine code
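- compiler optimisers can choose between equivalent-performance instructions that specify different registers, to minimise copying between them, and may choose registers in such a way that a single core can process multiple instructions during one cycle, whereas forcing everything through specific registers would serialise it
- in fairness, GCC has ways to express the need for specific types of register without constraining the CPU to an exact register, still allowing such optimisations, but it's the only inline assembly I've ever seen that addresses this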
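The GCC mechanism mentioned in the last point can be sketched roughly as follows, assuming GCC or Clang targeting x86-64 with the default AT&T syntax (the code falls back to plain C++ elsewhere). The function name add_u64 is hypothetical. The "r" constraints ask only for "some general-purpose register" and leave the actual choice to the compiler. Addition is used purely to keep the example short; as the answer says, plain a + b is what you would actually write.

```cpp
#include <cstdint>

// Hypothetical helper: adds two values with GCC extended asm on x86-64,
// and with ordinary C++ on any other compiler or target.
inline std::uint64_t add_u64(std::uint64_t a, std::uint64_t b) {
#if defined(__GNUC__) && defined(__x86_64__)
    // "+r"(a): a is both read and written, in any general-purpose register.
    // "r"(b):  b is read from any general-purpose register.
    // "cc":    the add instruction modifies the flags register.
    // No register is named explicitly, so the register allocator stays free
    // to pick whatever fits the surrounding code.
    __asm__("addq %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;  // portable fallback
#endif
}
```

Compiling a file containing this function with gcc -S and comparing the output against a version that simply returns a + b is one way to check the answer's claim that the compiler handles such operations equally well.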
One perspective that I think is worth keeping in mind is that when C was introduced it had to win over a lot of hardcore assembly language programmers who fussed over the machine code generated. Machines had less CPU power and RAM back then, and you can bet people fussed over the tiniest thing. Optimisers became very sophisticated and have continued to improve, whereas the assembly languages of processors like the x86 have become increasingly complicated, as have their execution pipelines, caches and the other factors involved in their performance. You can't just add up values from a table of cycles-per-instruction any more. Compiler writers spend time considering all those subtle factors (especially those working for CPU manufacturers, but that ups the pressure on other compilers too). It's now impractical for assembly programmers to average, over any non-trivial application, significantly better code efficiency than a good optimising compiler generates, and they're overwhelmingly likely to do worse. So use of assembly should be limited to the times it really makes a measurable and useful difference, worth the coupling and maintenance costs.