Question
I'm doing a bit of hands-on research into the speed benefits of making a function inline. I don't have the book with me, but one text I was reading suggested that function calls carry a fairly large overhead, and that whenever executable size is either negligible or can be spared, a function should be declared inline for speed.
I've written the following code to test this theory, and from what I can tell, there is no speed benefit from declaring a function as inline. Both functions, when called 4294967295 times on my computer, execute in 196 seconds.
My question is: why do you think this is happening? Is it modern compiler optimization? Is it because the function does not perform any large calculations?
Any insight on the matter would be appreciated. Thanks in advance, friends.
#include <cstdio>    // getchar
#include <ctime>     // clock, clock_t, CLOCKS_PER_SEC
#include <iostream>

// RESEARCH Jared Thomson 2010
////////////////////////////////////////////////////////////////////////////////
// Two functions that perform an identical arbitrary floating point calculation;
// one function is inline, the other is not.
double test(double a, double b, double c);
double inlineTest(double a, double b, double c);

double test(double a, double b, double c){
    a = (3.1415 / 1.2345) / 4 + 5;
    b = 9.999 / a + (a * a);
    c = a *= b;
    return c;
}

inline
double inlineTest(double a, double b, double c){
    a = (3.1415 / 1.2345) / 4 + 5;
    b = 9.999 / a + (a * a);
    c = a *= b;
    return c;
}

// ENTRY POINT Jared Thomson 2010
////////////////////////////////////////////////////////////////////////////////
int main(){
    const unsigned int maxUINT = -1;  // wraps around to the largest unsigned int
    clock_t start = clock();

    //============================ NON-INLINE TEST ===============================//
    for(unsigned int i = 0; i < maxUINT; ++i)
        test(1.1,2.2,3.3);

    clock_t end = clock();
    std::cout << maxUINT << " calls to non inline function took "
              << (end - start)/CLOCKS_PER_SEC << " seconds.\n";

    start = clock();

    //============================ INLINE TEST ===================================//
    for(unsigned int i = 0; i < maxUINT; ++i)
        inlineTest(1.1,2.2,3.3);  // was test(): the inline version was never timed

    end = clock();
    std::cout << maxUINT << " calls to inline function took "
              << (end - start)/CLOCKS_PER_SEC << " seconds.\n";

    getchar(); // Wait for input.
    return 0;
} // Main.
Assembly output
Recommended answer
The inline keyword is basically useless as a way of controlling inlining. It is a suggestion only. The compiler is free to ignore it and refuse to inline such a function, and it is also free to inline a function declared without the inline keyword.
If you are really interested in testing function call overhead, you should check the resulting assembly to ensure that the function really was (or wasn't) inlined. I'm not intimately familiar with VC++, but it may have a compiler-specific method of forcing or prohibiting the inlining of a function (however, the standard C++ inline keyword will not be it).
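As a rough illustration of such compiler-specific controls, the sketch below wraps the extensions I know of (__forceinline and __declspec(noinline) in VC++, the always_inline and noinline attributes in GCC and Clang) behind two macros. The macro and function names are hypothetical, invented for this example, and even the "force" variants are strong requests that a compiler can still decline in some situations, so checking the assembly remains the only reliable confirmation.

```cpp
#include <cassert>

// FORCE_INLINE and NO_INLINE are hypothetical names chosen for this sketch;
// the attributes behind them are compiler extensions, not standard C++.
#if defined(_MSC_VER)
  #define FORCE_INLINE __forceinline
  #define NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
  #define FORCE_INLINE inline __attribute__((always_inline))
  #define NO_INLINE    __attribute__((noinline))
#else
  // Unknown compiler: fall back to the plain (advisory) keyword.
  #define FORCE_INLINE inline
  #define NO_INLINE
#endif

// Two copies of the same small calculation (illustrative names).
// The result depends on the arguments, unlike the question's functions.
FORCE_INLINE double calcInlined(double a, double b) {
    return 9.999 / a + b;
}

NO_INLINE double calcOutOfLine(double a, double b) {
    return 9.999 / a + b;
}

// Inlining changes how the call is made, not what it computes.
inline void checkSameResult() {
    assert(calcInlined(2.0, 3.0) == calcOutOfLine(2.0, 3.0));
}
```

With the two variants pinned down like this, a timing comparison at least compares what it claims to compare, provided the calls are not optimized away altogether.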
So I suppose the answer to the larger context of your investigation is: don't worry about explicit inlining. Modern compilers know when to inline and when not to, and will generally make better decisions about it than even very experienced programmers. That's why the inline keyword is often entirely ignored. You should not worry about explicitly forcing or prohibiting inlining of a function unless you have a very specific need to do so (for example, after profiling your program and finding that a bottleneck could be solved by forcing an inline that the compiler has, for some reason, not done).
Re: the assembly:
; 30   :     const unsigned int maxUINT = -1;
; 31   :     clock_t start = clock();
    mov     esi, DWORD PTR __imp__clock
    push    edi
    call    esi
    mov     edi, eax
; 32   :
; 33   :     //============================ NON-INLINE TEST ===============================//
; 34   :     for(unsigned int i = 0; i < maxUINT; ++i)
; 35   :         blank(1.1,2.2,3.3);
; 36   :
; 37   :     clock_t end = clock();
    call    esi
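The assembly is:
- Reading the clock
- Storing the clock value
- Reading the clock again
Note what's missing: calling your function a whole bunch of times.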
The compiler has noticed that you don't do anything with the result of the function and that the function has no side effects, so it is not being called at all.
You can likely get it to call the function anyway by compiling with optimizations off (in debug mode).
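Another option, if you want to keep optimizations on, is to give the calls an observable effect so the compiler cannot drop them. The sketch below is one way to do that, not the asker's code: work and timeCalls are hypothetical names, the input varies per iteration so the result cannot be folded into a constant, and each result is written to a volatile variable, which the optimizer has to preserve.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the question's function. Unlike the original,
// the result depends on the argument, so it cannot be precomputed.
double work(double a) {
    double b = 9.999 / a + (a * a);
    return a * b;
}

// Times `iterations` calls to work() and returns the elapsed seconds.
// Every result is stored in a volatile sink: volatile writes are
// observable behavior, so the loop body cannot be removed as dead code.
double timeCalls(unsigned int iterations, double& lastResult) {
    assert(iterations > 0);
    volatile double sink = 0.0;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned int i = 0; i < iterations; ++i) {
        sink = work(1.0 + i);  // input changes on every iteration
    }
    const auto stop = std::chrono::steady_clock::now();

    lastResult = sink;  // hand the final value back to the caller
    return std::chrono::duration<double>(stop - start).count();
}
```

A caller would pass an iteration count and a double to receive the last result, then print the returned seconds. The compiler is still free to inline work inside the loop, which is exactly the decision under test, so the generated assembly should still be inspected before drawing conclusions from the timings.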