Problem Description
Loop unrolling is a common optimization, but is the reverse ever done too
(to reduce the size of the object file output, giving a smaller binary)?
I'm curious if it's a common technique for compilers to de-duplicate successive, identical blocks of code (or function calls) into a loop, or extract a duplicate block into a static function.
I'm interested because there are header-only libraries*
in C which can add a lot of duplicate code, so it would be useful to know if some C compilers are able to detect this and handle it more efficiently.
* By "header-only library" I mean headers which directly define code, rather than just function declarations.
And if this is done, it would be useful to know what conditions and constraints apply to ensure it can be taken advantage of.
Note: for the purposes of this question, any popular C compiler is fine (GCC/Clang/Intel/MSVC).
A header-only library I found, called uthash, uses some very large macros, and I wanted to know if there is some compiler trickery going on that can cleverly de-duplicate such huge blocks of code; see e.g. uthash.h. Another similar example is inline qsort.h.
Example of a block that could be de-duplicated (it turns out Py_DECREF
can expand into a fairly large block of code).
#define PY_ADD_TO_DICT(dict, i) \
do { \
PyDict_SetItemString(dict, names[i], item = PyUnicode_FromString(values[i])); \
Py_DECREF(item); \
} while (0)
/* this could be made into a loop */
PY_ADD_TO_DICT(d, 0);
PY_ADD_TO_DICT(d, 1);
PY_ADD_TO_DICT(d, 2);
PY_ADD_TO_DICT(d, 3);
Note that this example is contrived, but it is based on a real one.
It seems the short answer to my question is no (or only in some limited/trivial cases), so just to clarify why I was asking a bit further.
Some replies in comments seem to assume you would simply refactor the code into a function.
That is almost always the best option, of course; nevertheless, there are times when blocks of very similar code can show up:
- boilerplate code created by a not-very-smart code generator.
- when using an external API which exposes some of its functionality as macros (wrapping them locally in functions works in most cases, of course, but it means your code acquires quirks of its own that aren't typical for users of those APIs).
- when you can't replace macros with functions; there are some rare cases where doing so just ends up being impractical.
- when importing code from an external code-base, it's not always ideal to go in and start cleaning up their code, and when evaluating that code-base it's useful to have an idea how smart the compiler will be at optimizing it.
In all these cases it's possible to de-duplicate by hand (generate smarter code, wrap macros in functions, patch third-party libraries), but before making the effort to do such things it's worth knowing how much work the compiler is already doing for us.
Recommended Answer
Depending on the toolchain, you may have options to coax the compiler and linker into recognizing and coalescing redundant code. Some good Google keywords include:
- "code factoring" (and related keywords)
- "whole program optimization"
- "interprocedural optimization"
- "link time optimization" (LTO)
Note that the gcc optimizations page mentioned in previous comments provides some flags of interest, namely:
- -ftree-tail-merge
- -ftree-switch-conversion
- -fgcse
- -fcrossjumping
- -fipa-pta
- -fipa-icf (identical code folding), added in GCC5.x
- -fipa-cp
- -flto
- -fwhole-program
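As a sketch of how these fit together, a GCC build enabling identical-code folding and link-time optimization might look like the following (file names are placeholders; `-fipa-icf` requires GCC 5 or later):

```shell
# Compile each translation unit with identical-code folding and
# LTO bytecode emission enabled.
gcc -O2 -fipa-icf -flto -c a.c -o a.o
gcc -O2 -fipa-icf -flto -c b.c -o b.o

# Link with LTO so merging can also happen across translation units.
gcc -O2 -flto a.o b.o -o app
```

Comparing the size of `app` built with and without these flags is the quickest way to see how much duplicate code the toolchain actually folds for a given codebase.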
Finally, these blog posts are informative:
- http://hubicka.blogspot.com/2014/04/linktime-optimization-in-gcc-1-brief.html
- http://hubicka.blogspot.com/2014/04/linktime-optimization-in-gcc-2-firefox.html