I would like to know how to avoid wasting my time and risking typos by re-hashing source code when I'm integrating legacy code, library code or sample code into my own codebase.
If I give a simple example, based on an image processing scenario, you might see what I mean.
It's actually not unusual to find I'm integrating a code snippet like this:
    for (unsigned int y = 0; y < uHeight; y++) {
        for (unsigned int x = 0; x < uWidth; x++) {
            // do something with this pixel ....
            uPixel = pPixels[y * uStride + x];
        }
    }
Over time, I've become accustomed to doing things like moving unnecessary calculations out of the inner loop and maybe changing the postfix increments to prefix ...
    for (unsigned int y = 0; y < uHeight; ++y) {
        unsigned int uRowOffset = y * uStride;
        for (unsigned int x = 0; x < uWidth; ++x) {
            // do something with this pixel ....
            uPixel = pPixels[uRowOffset + x];
        }
    }
Or, I might use pointer arithmetic, either by row ...
    for (unsigned int y = 0; y < uHeight; ++y) {
        unsigned char *pRow = pPixels + (y * uStride);
        for (unsigned int x = 0; x < uWidth; ++x) {
            // do something with this pixel ....
            uPixel = pRow[x];
        }
    }
... or by row and column, so I end up with something like this:
    unsigned char *pRow = pPixels;
    for (unsigned int y = 0; y < uHeight; ++y) {
        unsigned char *pPixel = pRow;
        for (unsigned int x = 0; x < uWidth; ++x) {
            // do something with this pixel ....
            uPixel = *pPixel++;
        }
        // next row
        pRow += uStride;
    }
Now, when I write from scratch, I'll habitually apply my own "optimisations" but I'm aware that the compiler will also be doing things like:
- Moving code from the inner loop to the outer loop
- Changing postfix increments to prefix
- Lots of other stuff that I have no idea about
Every time I mess with a piece of working, tested code in this way, I not only cost myself some time but also run the risk of introducing bugs through finger trouble or whatever (the above examples are simplified). I'm aware of "premature optimisation" and of other ways to improve performance, such as designing better algorithms. But in the situations above I'm creating building blocks that will be used in larger, pipelined apps, where I can't predict what the non-functional requirements might be. So I just want the code as fast and tight as is reasonable within time limits (I mean the time I spend tweaking the code).
So, my question is: where can I find out which compiler optimisations are commonly supported by "modern" compilers? I'm using a mixture of Visual Studio 2008 and 2012, but would be interested to know if there are differences with alternatives, e.g. Intel's C/C++ Compiler. Can anyone offer some insight and/or point me at a useful web link, book or other reference?
Just to clarify my question:
- The optimisations I showed above were simple examples, not a complete list. I know that it's pointless (from a performance point of view) to make those specific changes because the compiler will do it anyway.
- I'm specifically looking for information about what optimisations are provided by the compilers I'm using.
I would expect most of the optimizations that you include as examples to be a waste of time. A good optimizing compiler should be able to do all of this for you.
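If you would rather check this than take my word for it, one approach is to compile two variants of the loop with optimization enabled and compare the assembly listings. The sketch below is illustrative only: the function names `sum_indexed` and `sum_pointer` are made up for this example, and summing the pixels stands in for "do something with this pixel". With MSVC, `cl /O2 /FAs` writes a listing file next to the object file; with GCC or Clang, `-O2 -S` produces assembly output. Whether the two listings come out identical depends on your compiler and version.

```cpp
// Hypothetical variant 1: index arithmetic left in the inner loop,
// as in the original snippet from the question.
unsigned long sum_indexed(const unsigned char *pPixels,
                          unsigned int uWidth, unsigned int uHeight,
                          unsigned int uStride)
{
    unsigned long total = 0;
    for (unsigned int y = 0; y < uHeight; y++) {
        for (unsigned int x = 0; x < uWidth; x++) {
            total += pPixels[y * uStride + x];
        }
    }
    return total;
}

// Hypothetical variant 2: hand-tweaked with row and pixel pointers.
unsigned long sum_pointer(const unsigned char *pPixels,
                          unsigned int uWidth, unsigned int uHeight,
                          unsigned int uStride)
{
    unsigned long total = 0;
    const unsigned char *pRow = pPixels;
    for (unsigned int y = 0; y < uHeight; ++y) {
        const unsigned char *pPixel = pRow;
        for (unsigned int x = 0; x < uWidth; ++x) {
            total += *pPixel++;
        }
        pRow += uStride;  // next row
    }
    return total;
}
```

Both functions assume the buffer holds at least `uHeight * uStride` bytes. If the inner loops in the two listings look the same, the hand-tweaking bought nothing for that compiler and those flags.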
I can offer three suggestions by way of practical advice:
- Profile your code in the context of a real application processing real data. If you can't, come up with some synthetic tests that you think would closely mimic the final system.
- Only optimize code that you have demonstrated through profiling to be a bottleneck.
- If you are convinced that a piece of code needs optimization, don't just assume that factoring an invariant expression out of a loop will improve performance. Always benchmark, optionally looking at the generated assembly to gain further insight.
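As a rough idea of what "always benchmark" could look like for the loops in the question, here is a minimal timing sketch. It assumes a C++11 compiler (so Visual Studio 2012 rather than 2008), and every name in it (`BenchResult`, `sumIndexed`, `sumHoisted`, `timeMs`, `compareVariants`) is hypothetical. It is not a substitute for profiling the real application, and the numbers only mean something in an optimized build.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch.
struct BenchResult {
    double msIndexed;             // wall-clock time, indexed variant
    double msHoisted;             // wall-clock time, hoisted variant
    unsigned long long checksum;  // accumulated pixel sum over all passes
};

// Index arithmetic left in the inner loop.
static unsigned long long sumIndexed(const unsigned char *p, unsigned w,
                                     unsigned h, unsigned stride)
{
    unsigned long long total = 0;
    for (unsigned y = 0; y < h; y++)
        for (unsigned x = 0; x < w; x++)
            total += p[y * stride + x];
    return total;
}

// Row offset hoisted out of the inner loop by hand.
static unsigned long long sumHoisted(const unsigned char *p, unsigned w,
                                     unsigned h, unsigned stride)
{
    unsigned long long total = 0;
    for (unsigned y = 0; y < h; ++y) {
        const unsigned rowOffset = y * stride;
        for (unsigned x = 0; x < w; ++x)
            total += p[rowOffset + x];
    }
    return total;
}

// Runs f() `repeats` times, returns elapsed milliseconds, and stores
// the accumulated result in `out` so the work cannot be discarded.
template <typename F>
static double timeMs(F f, unsigned repeats, unsigned long long &out)
{
    const auto start = std::chrono::steady_clock::now();
    unsigned long long acc = 0;
    for (unsigned i = 0; i < repeats; ++i)
        acc += f();
    const auto stop = std::chrono::steady_clock::now();
    out = acc;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

BenchResult compareVariants(unsigned w, unsigned h, unsigned stride,
                            unsigned repeats)
{
    // Synthetic image; real data would be better if available.
    std::vector<unsigned char> pixels(static_cast<std::size_t>(h) * stride);
    for (std::size_t i = 0; i < pixels.size(); ++i)
        pixels[i] = static_cast<unsigned char>(i * 31u);

    unsigned long long a = 0, b = 0;
    BenchResult r;
    r.msIndexed = timeMs(
        [&] { return sumIndexed(pixels.data(), w, h, stride); }, repeats, a);
    r.msHoisted = timeMs(
        [&] { return sumHoisted(pixels.data(), w, h, stride); }, repeats, b);
    assert(a == b);  // both variants must compute the same thing
    r.checksum = a;
    return r;
}
```

You might call `compareVariants(1920, 1080, 2048, 100)` from `main` and print the two timings. Treat the result as a rough indication only: an aggressive optimizer may fold the repeated passes, and cache effects, run order and machine load all move the numbers, so repeat the measurement several times before drawing conclusions.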
The above advice applies to any optimizations. However, the last point is particularly relevant to low-level optimizations. They are a bit of a black art since there are a lot of relevant architectural details involved: memory hierarchy and bandwidth, instruction pipelining, branch prediction, the use of SIMD instructions etc.
I think it's better to rely on the compiler writer having a good knowledge of the target architecture than to try and outsmart them.
From time to time you will find through profiling that you need to optimize things by hand. However, these instances will be fairly rare, which will allow you to spend a good deal of energy on things that will actually make a difference.
In the meantime, focus on writing correct and maintainable code.