Is it possible to call a built-in function from assembly in C++?


Question

Consider the following assembly code loop:

    #include <iostream>

    #define ADD_LOOP(i, n, v)       \
    asm volatile (                  \
        "movw %1, %%cx ;"           \
        "movq %2, %%rax ;"          \
        "movq $0, %%rbx ;"          \
        "for: ;"                    \
        "addq %%rax, %%rbx ;"       \
        "decw %%cx ;"               \
        "jnz for ;"                 \
        "movq %%rbx, %0 ;"          \
        : "=x"(v)                   \
        : "n"(i), "x"(n)            \
        : "%cx", "%rax", "%rbx"     \
    );

    int main() {
        uint16_t iter(10000);
        uint64_t num(5);
        uint64_t val;

        ADD_LOOP(iter, num, val)

        std::cout << val << std::endl;

        return 0;
    }

Is it possible to call a C function (or its machine code output) from within a loop like the one above? For example:

    #include <wmmintrin.h>

    int main() {
        __m128i x, y;

        for (int i = 0; i < 10; i++) {
            x = __builtin_ia32_aesenc128(x, y);
        }

        return 0;
    }

Thanks

Solution

No. Builtin functions aren't real functions that you can call with call. They always inline when used in C / C++.
For example, if you want int __builtin_popcount (unsigned int x) so that you get either a popcnt instruction on targets built with -mpopcnt, or a byte-wise lookup table on targets that don't support the popcnt instruction, you are out of luck. You will have to #ifdef it yourself and use popcnt or an alternative sequence of instructions.

The function you're talking about, __builtin_ia32_aesenc128, is just a wrapper for the aesenc assembly instruction, which you can use directly if you are writing in asm.

If you're writing asm instead of using C++ intrinsics (like #include <immintrin.h>) for performance, you need to have a look at http://agner.org/optimize/ to write more efficient asm. For example, use %ecx as a loop counter, not %cx: you gain nothing from using a 16-bit partial register.

You could also write more efficient inline-asm constraints. For example, the movq %%rbx, %0 is a wasted instruction: you could have used %0 the whole time instead of an explicit %rbx. If your inline asm starts or ends with a mov instruction that copies to or from an output/input operand, you are usually doing it wrong. Let the compiler allocate registers for you. See the inline-assembly tag wiki.

Or better, https://gcc.gnu.org/wiki/DontUseInlineAsm. Code with intrinsics typically compiles well for x86. See Intel's intrinsics guide: #include <immintrin.h> and use __m128i _mm_aesenc_si128 (__m128i a, __m128i RoundKey). (In gcc that's just a wrapper for __builtin_ia32_aesenc128, but it makes your code portable to other x86 compilers.)
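To illustrate the advice about constraints, here is a minimal sketch of how the question's loop might look when the compiler picks the registers. It is not the answer author's code: the function name add_loop, the named operands, and the plain C++ fallback for non-x86-64 builds are assumptions made for this example. It uses a 32-bit counter, accumulates directly into the output operand, and has no mov at the start or end.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: adds `n` to a running sum `count` times.
std::uint64_t add_loop(std::uint32_t count, std::uint64_t n) {
    assert(count != 0);  // the counter is only tested after the first add
    std::uint64_t sum = 0;
#if defined(__x86_64__) && defined(__GNUC__)
    // "+&r": read-write operand in a compiler-chosen register, marked
    // early-clobber because it is written before the input `n` is last read.
    asm("1:\n\t"
        "addq %[n], %[sum]\n\t"   // accumulate straight into the output operand
        "decl %[cnt]\n\t"         // 32-bit counter, no 16-bit partial register
        "jnz 1b"                  // local numeric label, safe if inlined twice
        : [sum] "+&r"(sum), [cnt] "+&r"(count)
        : [n] "r"(n)
        : "cc");
#else
    // Assumed fallback so the sketch still builds on other targets.
    do { sum += n; } while (--count != 0);
#endif
    return sum;
}
```

Calling add_loop(10000, 5) corresponds to the question's ADD_LOOP(iter, num, val) and gives 50000. The numeric label 1: also avoids the duplicate-symbol problem that a named label such as for: can cause when the asm statement is emitted more than once.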
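The intrinsic route recommended at the end of the answer can be sketched as follows. This is an illustration under stated assumptions, not code from the answer: the names Block, aesenc_rounds, aesenc_rounds_impl and HAVE_X86_AES_SKETCH are hypothetical, and the GCC/Clang target attribute plus the __builtin_cpu_supports check are one possible way to use AES-NI without passing -maes for the whole file.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_AES_SKETCH 1  // hypothetical macro, used only in this sketch
#endif

using Block = std::array<std::uint8_t, 16>;  // one 128-bit AES state or key

#ifdef HAVE_X86_AES_SKETCH
// Only this function is compiled with AES-NI enabled.
__attribute__((target("aes,sse2")))
static Block aesenc_rounds_impl(Block state, const Block& key, int rounds) {
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(state.data()));
    __m128i k = _mm_loadu_si128(reinterpret_cast<const __m128i*>(key.data()));
    for (int i = 0; i < rounds; ++i) {
        x = _mm_aesenc_si128(x, k);  // compiles to an aesenc instruction
    }
    _mm_storeu_si128(reinterpret_cast<__m128i*>(state.data()), x);
    return state;
}
#endif

// Applies `rounds` AES encryption rounds to `state` using the same round key.
// Returns false and leaves `state` untouched if AES-NI is unavailable.
bool aesenc_rounds(Block& state, const Block& key, int rounds) {
    assert(rounds >= 0);
#ifdef HAVE_X86_AES_SKETCH
    if (__builtin_cpu_supports("aes")) {
        state = aesenc_rounds_impl(state, key, rounds);
        return true;
    }
#endif
    return false;
}
```

The question's loop then becomes aesenc_rounds(x, y, 10) on two Block values. As a sanity check, one round on an all-zero state with an all-zero round key yields sixteen 0x63 bytes, because SubBytes maps 0 to 0x63 and ShiftRows and MixColumns leave a uniform state unchanged.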