问题描述
我正在尝试获得[0 ... n-1]
范围内的模n(0
idiv n
指令将EDX:EAX(64位)除以n和树叶
EAX =商,EDX =余数. (n
是我的代码中的寄存器)
我的问题是,当EDX:EAX的内容为负数时,我在EDX中得到的结果为负数.
我找到的最简单的解决方案是:
cdq ; extend EAX sign bit to EDX
idiv n ; edx = (possibly neg.) remainder
add edx, n
mov eax, edx ; eax = remainder + n
cdq
idiv n ; edx = positive remainder
是否有更清洁/更轻松/更快捷的方法来获得正余数?
产生非负模数的除法称为欧几里得分裂.
有关C实现和更多背景信息,请参见计算机科学家的部门和模量. (也相关:"mod"和"remainder有什么区别"?).
这是一个特例,我们知道除数为正,这允许Ped7g建议更快的实现.它可能是最佳的,或者至少接近它.
有趣的是,我们可以用C编写与Ped7g手写版本相同的asm(或接近),具体取决于gcc与clang的方式.看到它在Godbolt编译器浏览器上.
// https://stackoverflow.com/questions/40861023/how-do-i-get-a-positive-modulo-on-a-negative-dividend
// uses unsigned carry to detect when a negative 2's complement integer wrapped to non-negative
long modE_positive_divisor_2s_complement( long D, long d )
{
long r = D%d;
unsigned long tmp = (unsigned long)r + (unsigned long)d;
// detect carry by looking for unsigned wraparound:
// that means remainder was negative (not zero), so adding the (non-negative) divisor is what we need
r = (tmp < (unsigned long)r) ? (long)tmp : r;
return r;
}
使用clang3.9 -O3,我们得到:
mov rax, rdi
cqo
idiv rsi
add rsi, rdx
cmovae rsi, rdx
mov rax, rsi
ret
gcc6.2在CMOV之前使用了额外的CMP:(
在Godbolt上,我包括了原始的通用C函数以及一个告诉编译器采用正数d
(使用__builtin_unreachable()
)的版本.对于后者,gcc使用test rdx,rdx
/cmovs
进行条件加.
对于32位和64位long
(使用EAX代替RAX等),它的工作原理相同,因此,如果您关心性能并且不需要64位整数,请使用int32_t
. (idiv r64
比idiv r32
慢得多.)
I am trying to get a modulo-n (0 < n) in the range [0 ... n-1]
.
The instruction idiv n
divides EDX:EAX (64 bits) by n, and leaves
EAX=quotient, EDX=remainder. (n
is a register in my code)
My problem is that when the content of EDX:EAX is negative, I get a negative result in EDX.
The simplest solution I found was:
cdq ; extend EAX sign bit to EDX
idiv n ; edx = (possibly neg.) remainder
add edx, n
mov eax, edx ; eax = remainder + n
cdq
idiv n ; edx = positive remainder
Is there a cleaner/easier/quicker way to get the positive remainder?
The kind of division that produces a non-negative modulo is called Euclidean division.
For C implementations and more background, see Division and Modulus for Computer Scientists. (also related: What's the difference between "mod" and "remainder"?).
This is a special case, where we know the divisor is positive, which allows the faster implementation suggested by Ped7g. It's probably optimal or at least close to it.
Interestingly, we can write it in C in a way that compiles to the same asm as Ped7g's hand-written version (or close to it, depending on gcc vs. clang). See it on the Godbolt compiler explorer.
// https://stackoverflow.com/questions/40861023/how-do-i-get-a-positive-modulo-on-a-negative-dividend
// uses unsigned carry to detect when a negative 2's complement integer wrapped to non-negative
long modE_positive_divisor_2s_complement( long D, long d )
{
long r = D%d;
unsigned long tmp = (unsigned long)r + (unsigned long)d;
// detect carry by looking for unsigned wraparound:
// that means remainder was negative (not zero), so adding the (non-negative) divisor is what we need
r = (tmp < (unsigned long)r) ? (long)tmp : r;
return r;
}
With clang3.9 -O3, we get:
mov rax, rdi
cqo
idiv rsi
add rsi, rdx
cmovae rsi, rdx
mov rax, rsi
ret
gcc6.2 uses an extra CMP before the CMOV :(
On Godbolt, I included the original general-case C function and a version that tells the compiler to assume positive d
(using __builtin_unreachable()
). For the latter, gcc does about as well as with this, using a test rdx,rdx
/ cmovs
to do the conditional add.
It works identically for 32-bit vs. 64-bit long
(with EAX instead of RAX, etc.), so use int32_t
if you care about performance and don't need 64-bit integers. (idiv r64
is much slower than idiv r32
).
这篇关于如何获得负股利的正模的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!