Problem description
What is the best (cleanest, most efficient) way to write saturating addition in C?
The function or macro should add two unsigned inputs (need both 16- and 32-bit versions) and return all-bits-one (0xFFFF or 0xFFFFFFFF) if the sum overflows.
Target is x86 and ARM using gcc (4.1.2) and Visual Studio (for simulation only, so a fallback implementation is OK there).
Recommended answer
You probably want portable C code here, which your compiler will turn into proper ARM assembly. ARM has conditional moves, and these can be conditional on overflow. The algorithm then becomes: add, and conditionally set the destination to unsigned(-1) if overflow was detected.
#include <stdint.h>

uint16_t add16(uint16_t a, uint16_t b)
{
    uint16_t c = a + b;   /* sum is truncated to 16 bits on assignment */
    if (c < a)            /* Can only happen due to overflow */
        c = -1;           /* converts to 0xFFFF */
    return c;
}
Note that this differs from the other algorithms in that it corrects overflow, instead of relying on another calculation to detect overflow.
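The compiler output below refers to a 32-bit function named adds32, which this page does not show. A minimal sketch of what it presumably looks like follows, assuming it mirrors add16 exactly; only the name comes from the answer, the body is an assumption.

```c
#include <stdint.h>

/* Assumed 32-bit counterpart of add16: saturates to 0xFFFFFFFF on overflow. */
uint32_t adds32(uint32_t a, uint32_t b)
{
    uint32_t c = a + b;     /* unsigned arithmetic wraps modulo 2^32 */
    if (c < a)              /* the wrapped sum is below an operand only on overflow */
        c = UINT32_MAX;
    return c;
}
```

Because unsigned wraparound is well defined in C, the comparison after the add is a portable way to recover the carry, and it gives the compiler a pattern it can map onto the carry flag.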
x86-64 clang 3.7 -O3 output for adds32 (the 32-bit analogue of add16): significantly better than any other answer:
add edi, esi
mov eax, -1
cmovae eax, edi
ret
ARMv7: gcc 4.8 -O3 -mcpu=cortex-a15 -fverbose-asm output for adds32:
adds r0, r0, r1 @ c, a, b
it cs
movcs r0, #-1 @ conditional-move
bx lr
16-bit: the output for add16 still doesn't use ARM's unsigned-saturating add instruction (UQADD16):
add r1, r1, r0 @ tmp114, a
movw r3, #65535 @ tmp116,
uxth r1, r1 @ c, tmp114
cmp r0, r1 @ a, c
ite ls @
movls r0, r1 @,, c
movhi r0, r3 @,, tmp116
bx lr @
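One way to gain confidence that the c < a test in add16 catches every overflow is to compare it against a reference that adds in a wider type and clamps. The sketch below is illustrative only: add16_ref and count_add16_mismatches are hypothetical helper names, and the check covers every value of a against a handful of boundary values of b rather than all 2^32 pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* add16 as given in the answer. */
uint16_t add16(uint16_t a, uint16_t b)
{
    uint16_t c = a + b;
    if (c < a)              /* Can only happen due to overflow */
        c = -1;
    return c;
}

/* Hypothetical reference: add in 32 bits, then clamp to 0xFFFF. */
static uint16_t add16_ref(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + b;
    return sum > 0xFFFFu ? 0xFFFFu : (uint16_t)sum;
}

/* Hypothetical helper: counts disagreements between add16 and the
   reference for every a and a few boundary values of b. */
unsigned long count_add16_mismatches(void)
{
    static const uint16_t edge[] = { 0, 1, 2, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF };
    unsigned long mismatches = 0;
    uint32_t a;
    size_t i;

    for (a = 0; a <= 0xFFFFu; ++a)
        for (i = 0; i < sizeof edge / sizeof edge[0]; ++i)
            if (add16((uint16_t)a, edge[i]) != add16_ref((uint16_t)a, edge[i]))
                ++mismatches;
    return mismatches;
}
```

A return value of zero means the two formulations agree on the sampled inputs; widening the loop over b to the full 16-bit range would make the check exhaustive at the cost of a longer run.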