Problem description
I've compiled the following using the Visual Studio C++ 2008 SP1 x64 C++ compiler:

I'm curious: why did the compiler add those nop instructions after those calls?

PS1. I would understand if the 2nd and 3rd nops were there to align the code on a 4-byte boundary, but the 1st nop breaks that assumption.

PS2. The C++ code that was compiled had no loops or special optimization stuff in it:

    CTestDlg::CTestDlg(CWnd* pParent /*=NULL*/)
        : CDialog(CTestDlg::IDD, pParent)
    {
        m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);

        // This makes no sense. I used it to set a debugger breakpoint
        ::GdiFlush();
        srand(::GetTickCount());
    }

PS3. Additional info: First off, thank you everyone for your input.

Here are additional observations:

1. My first guess was that incremental linking could have had something to do with it. But the Release build settings in Visual Studio for the project have incremental linking off.

2. This seems to affect x64 builds only. The same code built as x86 (or Win32) does not have those nops, even though the instructions used are very similar:

3. I tried to build it with a newer linker, and even though the x64 code produced by VS 2013 looks somewhat different, it still adds those nops after some calls:

4. Also, dynamic vs. static linking to MFC made no difference to the presence of those nops. This one is built with dynamic linking to the MFC DLLs with VS 2013:

5. Also note that those nops can appear after near and far calls as well, and they have nothing to do with alignment. Here's a part of the code that I got from IDA if I step a little bit further on:

As you see, the nop is inserted after a far call that happens to "align" the next lea instruction on the B address! That makes no sense if those were added for alignment only.

I was originally inclined to believe that since near relative calls (i.e. those that start with E8) are somewhat faster than far calls (or the ones that start with FF 15 in this case), the linker may try to go with near calls first, and since those are one byte shorter than far calls, if it succeeds, it may pad the remaining space with nops at the end. But then example (5) above kinda defeats this hypothesis.

So I still don't have a clear answer to this.
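
To check the hypothesis above against a raw byte dump, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, it recognizes just the two call encodings mentioned above (E8 followed by a 4-byte relative offset, and FF 15 followed by a 4-byte displacement), and it is a naive byte scan, not a real disassembler, so it can misfire on operand bytes that happen to look like opcodes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Length in bytes of the call starting at code[i], or 0 if the bytes there
// are not one of the two encodings discussed in the question.
std::size_t call_length(const std::vector<std::uint8_t>& code, std::size_t i) {
    if (i < code.size() && code[i] == 0xE8) {
        return 5;  // E8 rel32: opcode + 4-byte relative offset
    }
    if (i + 1 < code.size() && code[i] == 0xFF && code[i + 1] == 0x15) {
        return 6;  // FF 15 disp32: opcode + ModRM + 4-byte displacement
    }
    return 0;
}

// Offsets of every one-byte nop (0x90) that directly follows such a call.
std::vector<std::size_t> nops_after_calls(const std::vector<std::uint8_t>& code) {
    std::vector<std::size_t> hits;
    std::size_t i = 0;
    while (i < code.size()) {
        const std::size_t len = call_length(code, i);
        if (len == 0) {
            ++i;  // unknown byte: skip it (this is where a real decoder is needed)
            continue;
        }
        const std::size_t next = i + len;
        if (next < code.size() && code[next] == 0x90) {
            hits.push_back(next);
        }
        i = next;
    }
    return hits;
}
```

Feeding it the bytes of a function and printing each offset modulo 4 or 16 would show directly whether the nops land on aligned addresses or not.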
Solution
This is purely a guess, but it might be some kind of SEH optimization. I say optimization because SEH seems to work fine without the NOPs too. The NOPs might help speed up unwinding.
In the following example (live demo with VC2017), there is a NOP inserted after a call to basic_string::assign in test1 but not in test2 (identical but declared as non-throwing).

    #include <stdio.h>
    #include <string>

    int test1() {
        std::string s = "a"; // NOP inserted here
        s += getchar();
        return (int)s.length();
    }

    int test2() throw() {
        std::string s = "a";
        s += getchar();
        return (int)s.length();
    }

    int main() {
        return test1() + test2();
    }

Assembly:

    test1:
        . . .
        call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
        npad 1 ; nop
        call getchar
        . . .
    test2:
        . . .
        call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign
        call getchar

Note that MSVS compiles by default with the /EHsc flag (synchronous exception handling). Without that flag the NOPs disappear, and with /EHa (synchronous and asynchronous exception handling), throw() no longer makes a difference because SEH is always on.

For some reason only throw() seems to reduce the code size; using noexcept makes the generated code even bigger and summons even more NOPs. MSVC...
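
To try the noexcept comparison yourself, a pair of functions along these lines could be compiled and their listings compared. This is just a sketch under a few assumptions: the names length_plain and length_noexcept are made up, and the character is passed in as a parameter instead of being read with getchar so the functions are deterministic. The throw() variant is left out because that spelling was removed in C++20.

```cpp
#include <cassert>
#include <string>

// Same body as test1 above, with the character passed in as a parameter
// (an assumption made here; the original reads it with getchar).
int length_plain(int c) {
    std::string s = "a";
    s += static_cast<char>(c);
    return static_cast<int>(s.length());
}

// Identical body, but declared noexcept instead of throw().
int length_noexcept(int c) noexcept {
    std::string s = "a";
    s += static_cast<char>(c);
    return static_cast<int>(s.length());
}
```

Compiling this with something like cl /c /O2 /EHsc /FA and diffing the two functions in the generated .asm file should show whether the extra NOPs appear; the result may well differ between compiler versions.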