GCC to emit ARM idiv instructions


Question

How can I instruct gcc to emit idiv (integer division, udiv and sdiv) instructions for ARM application processors?

So far, the only way I have found is to use -mcpu=cortex-a15 with gcc 4.7.

$cat idiv.c
int test_idiv(int a, int b) {
    return a / b;
}

On gcc 4.7 (bundled with Android NDK r8e):

$gcc -O2 -mcpu=cortex-a15 -c idiv.c
$objdump -S idiv.o

00000000 <test_idiv>:
   0:   e710f110    sdiv    r0, r0, r1
   4:   e12fff1e    bx  lr

Even this approach has a catch. If you add -march=armv7-a next to -mcpu=cortex-a15, gcc prints "idiv.c:1:0: warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch [enabled by default]" and no longer emits the idiv instruction:

$gcc -O2 -mcpu=cortex-a15 -march=armv7-a -c idiv.c

idiv.c:1:0: warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch [enabled by default]

$objdump -S idiv.o
00000000 <test_idiv>:
   0:   e92d4008    push    {r3, lr}
   4:   ebfffffe    bl  0 <__aeabi_idiv>
   8:   e8bd8008    pop {r3, pc}

On gcc 4.6 (also bundled with Android NDK r8e), no idiv instructions are emitted at all. It does recognize -mcpu=cortex-a15, and it does not complain about the -mcpu=cortex-a15 -march=armv7-a combination.

As far as I know, idiv is optional on ARMv7, so there should be a cleaner way to instruct gcc to emit these instructions. How?

Recommended answer

If the instruction is not in the machine descriptions, then I doubt that gcc will emit it.

If the compiler does not support the instruction, you can always use inline assembler to get it. Since your opcode is fairly rare and machine specific, there has probably not been much effort to get it into the gcc source. In particular, there are arch and tune/cpu flags. The tune/cpu flag is for a more specific machine, while the arch flag is supposed to allow all machines in that architecture. If I understand correctly, this opcode seems to break that rule, because it is optional within the architecture.
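As a rough illustration of the inline-assembler route, here is a minimal sketch. The function name my_sdiv is made up for this example. The guard macros `__ARM_ARCH_EXT_IDIV__` (GCC-specific) and `__ARM_FEATURE_IDIV` (ACLE) are not mentioned in the question; they are assumed here as a way to detect whether the compiler believes hardware divide is available. I have not tried this on the NDK toolchains discussed above.

```c
/* Sketch only: signed divide that uses the hardware sdiv instruction when the
   compiler reports the divide extension, and plain C division otherwise.
   my_sdiv is a hypothetical name; the guard macros are an assumption. */
#if defined(__ARM_ARCH_EXT_IDIV__) || defined(__ARM_FEATURE_IDIV)
static inline int my_sdiv(int a, int b)
{
    int q;
    /* sdiv Rd, Rn, Rm computes Rd = Rn / Rm (signed, rounds toward zero). */
    __asm__("sdiv %0, %1, %2" : "=r"(q) : "r"(a), "r"(b));
    return q;
}
#else
static inline int my_sdiv(int a, int b)
{
    /* Leave the choice to the compiler; on ARM without hardware divide
       this typically becomes a call to __aeabi_idiv. */
    return a / b;
}
#endif
```

If you remove the guard to force the instruction, the assembler may still reject sdiv unless it has been told that the target has the divide extension, so this only helps when the flags passed to the assembler allow it.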

For gcc 4.6.2, it looks like thumb2 and cortex-r4 are the cues to use these instructions. As you have noted, with gcc 4.7.2 cortex-a15 seems to have been added as another cue. In gcc 4.7.2 the thumb2.md file no longer has udiv/sdiv, but they may be included somewhere else; I am not 100% familiar with the machine description language. It also seems that cortex-a7, cortex-a15, and cortex-r5 may enable these instructions with 4.7.2.

This doesn't answer the question directly, but it does give some information and a path toward the answer. You can compile the module with -mcpu=cortex-r4, although this may produce linker issues. There is also int my_idiv(int a, int b) __attribute__ ((__target__ ("arch=cortex-r4")));, which lets you specify on a per-function basis the machine description used by the code generator. I haven't used either of these myself; they are only possibilities to try. Generally you don't want to leave the wrong machine selected, as it could generate sub-optimal (and possibly illegal) opcodes. You will have to experiment, and then perhaps you can provide the real answer.

Note 1: This is for stock gcc 4.6.2 and 4.7.2. I don't know if your Android compiler has patches.

gcc-4.6.2/gcc/config/arm$ grep [ius]div *.md
arm.md: "...,sdiv,udiv,other"
cortex-r4.md:;; We guess that division of A/B using sdiv or udiv, on average,
cortex-r4.md:;; This gives a latency of nine for udiv and ten for sdiv.
cortex-r4.md:(define_insn_reservation "cortex_r4_udiv" 9
cortex-r4.md:       (eq_attr "insn" "udiv"))
cortex-r4.md:(define_insn_reservation "cortex_r4_sdiv" 10
cortex-r4.md:       (eq_attr "insn" "sdiv"))
thumb2.md:  "sdiv%?\t%0, %1, %2"
thumb2.md:   (set_attr "insn" "sdiv")]
thumb2.md:(define_insn "udivsi3"
thumb2.md:      (udiv:SI (match_operand:SI 1 "s_register_operand"  "r")
thumb2.md:  "udiv%?\t%0, %1, %2"
thumb2.md:   (set_attr "insn" "udiv")]

gcc-4.7.2/gcc/config/arm$ grep -i [ius]div *.md
arm.md:  "...,sdiv,udiv,other"
arm.md:  "TARGET_IDIV"
arm.md:  "sdiv%?\t%0, %1, %2"
arm.md:   (set_attr "insn" "sdiv")]
arm.md:(define_insn "udivsi3"
arm.md: (udiv:SI (match_operand:SI 1 "s_register_operand"  "r")
arm.md:  "TARGET_IDIV"
arm.md:  "udiv%?\t%0, %1, %2"
arm.md:   (set_attr "insn" "udiv")]
cortex-a15.md:(define_insn_reservation "cortex_a15_udiv" 9
cortex-a15.md:       (eq_attr "insn" "udiv"))
cortex-a15.md:(define_insn_reservation "cortex_a15_sdiv" 10
cortex-a15.md:       (eq_attr "insn" "sdiv"))
cortex-r4.md:;; We guess that division of A/B using sdiv or udiv, on average,
cortex-r4.md:;; This gives a latency of nine for udiv and ten for sdiv.
cortex-r4.md:(define_insn_reservation "cortex_r4_udiv" 9
cortex-r4.md:       (eq_attr "insn" "udiv"))
cortex-r4.md:(define_insn_reservation "cortex_r4_sdiv" 10
cortex-r4.md:       (eq_attr "insn" "sdiv"))

Note 2: See "pre-processor as Assembler" if gcc is passing options to gas that prevent use of the udiv/sdiv instructions. For example, you can use asm(" .long <opcode>\n"); where <opcode> is the output of a macro that encodes the registers and is then token-pasted and stringified. You can also annotate your assembler source to specify changes in the machine, so you can temporarily lie and say you have a cortex-r4, etc.
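A minimal sketch of the .long idea follows. The base word is taken from the disassembly in the question (e710f110 is sdiv r0, r0, r1), which matches the ARM-mode encoding with Rd in bits 19:16, Rm in bits 11:8 and Rn in bits 3:0. All names here (SDIV_ARM_WORD, STR, sdiv_arm_word, raw_sdiv) are invented for illustration, and the ARM path is an untested assumption. It is valid only in ARM mode, not Thumb.

```c
#include <stdint.h>

/* ARM-mode encoding of "sdiv Rd, Rn, Rm" with condition AL.
   Hypothetical macro; base word derived from e710f110 = sdiv r0, r0, r1. */
#define SDIV_ARM_WORD(rd, rn, rm) \
    (0xE710F010 | ((rd) << 16) | ((rm) << 8) | (rn))

/* Two-level stringification so the macro is expanded before it is quoted. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Helper that returns the same word, so the encoding can be checked on any host. */
static uint32_t sdiv_arm_word(unsigned rd, unsigned rn, unsigned rm)
{
    return SDIV_ARM_WORD(rd, rn, rm);
}

static int raw_sdiv(int a, int b)
{
#if defined(__arm__) && !defined(__thumb__)
    /* Pin the operands to r0 and r1 because the opcode hard-codes them. */
    register int q __asm__("r0") = a;
    register int d __asm__("r1") = b;
    /* Emits the raw word for sdiv r0, r0, r1, bypassing the assembler's
       check of which instructions the selected CPU supports. */
    __asm__ volatile(".long " STR(SDIV_ARM_WORD(0, 0, 1)) "\n"
                     : "+r"(q) : "r"(d));
    return q;
#else
    /* Not ARM mode: fall back to ordinary division. */
    return a / b;
#endif
}
```

Because the word is emitted blindly, running it on a core without the divide extension raises an undefined-instruction exception, so some runtime or build-time check of the target is still needed.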

Note 3:

gcc-4.7.2/gcc/config/arm$ grep -E 'TARGET_IDIV|arm_arch_arm_hwdiv|FL_ARM_DIV' *
arm.c:#define FL_ARM_DIV    (1 << 23)         /* Hardware divide (ARM mode).  */
arm.c:int arm_arch_arm_hwdiv;
arm.c:  arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
arm-cores.def:ARM_CORE("cortex-a7",  cortexa7,  7A, ... FL_ARM_DIV
arm-cores.def:ARM_CORE("cortex-a15", cortexa15, 7A, ... FL_ARM_DIV
arm-cores.def:ARM_CORE("cortex-r5",  cortexr5,  7R, ... FL_ARM_DIV
arm.h:  if (TARGET_IDIV)                                \
arm.h:#define TARGET_IDIV               ((TARGET_ARM && arm_arch_arm_hwdiv) \
arm.h:extern int arm_arch_arm_hwdiv;
arm.md:  "TARGET_IDIV"
arm.md:  "TARGET_IDIV"
