Problem description
How do I add a constant to each array element?
See the code snippet below:
#include "stdafx.h"
#include <iostream>
#include <iomanip>
#include "xmmintrin.h"
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    int x[4][4]={1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4}; //source or input
    int y[4][4];//destination or output
    int b[4] ={5,5,5,5};// constant
    int f0,f1,f2,f3;
    __asm
    {
        mov eax,x
        mov ecx,b
        mov edi,y
        //stage one
        movq mm0,[eax]
        movq mm7,[ecx]
        paddw mm0,mm7
        movq [f0],mm0
        movq mm1,[eax+8]
        paddw mm1,mm7
        movq [f1],mm1
        movq mm2,[eax+16]
        paddw mm2,mm7
        movq [f2],mm2
        movq mm3,[eax+24]
        paddw mm3,mm7
        movq [f3],mm3
        //stage two
        movq mm0,[f0]
        psraw mm0,6
        movq [edi],mm0
        movq mm1,[f1]
        psraw mm1,6
        movq [edi+8],mm1
        movq mm2,[f2]
        psraw mm2,6
        movq [edi+16],mm2
        movq mm3,[f3]
        psraw mm3,6
        movq [edi+24],mm3
    }
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < 4; j++)
            cout << y[i][j] << " ";
        cout << endl;
    }
    return 0;
}
Recommended answer
int _tmain(int argc, _TCHAR* argv[])
{
    int x[4][4]={1,2,3,4,2,2,2,2,3,3,3,3,4,4,4,4}; //source or input
    int y[4][4];//destination or output
    int b[4] ={5,5,5,5}; // constant
    //int f0,f1,f2,f3;
    __asm
    {
        push esi
        push edi
        push ebx
        lea esi, x
        lea edi, y
        lea edx, b
        //stage one
        mov eax, 4
        mov ebx, eax ; mov ebx, 1st dim size
    _loop_00:
        mov ecx, eax ; mov ecx, 2nd dim size
    _loop_01:
        movd mm0, [edx]
        movd mm1, [esi]
        paddd mm0, mm1
        ; psraw mm0, 6 ; // stage two
        movd [edi], mm0
        add esi, eax
        add edi, eax
        loop _loop_01
        add edx, eax
        dec ebx
        jnz _loop_00
        //stage two
        // ???
        pop ebx
        pop edi
        pop esi
    }
    _mm_empty();
    for (int i = 0; i < 4; i++)
    {
        for (int j = 0; j < 4; j++)
            cout << y[i][j] << " ";
        cout << endl;
    }
    return 0;
}
I didn't understand stage two, so I commented it out: with such small values it yields all zeros. :-) If you need the arithmetic shift, uncomment the line "; psraw mm0, 6".
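To see why the shift gives zeros, here is a minimal portable C++ sketch. It is not the MMX code itself, and `add_then_shift` is just a name made up for illustration. It assumes non-negative values, for which a right shift by 6 is the same as integer division by 64, so any sum below 64 comes out as 0.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only: add a constant, then shift right.
// For non-negative inputs, shifting right by 6 equals integer division by 64,
// which is what "stage two" does to each sum.
constexpr std::int32_t add_then_shift(std::int32_t value, std::int32_t constant, int shift)
{
    return (value + constant) >> shift;
}
```

With the arrays from the question, the largest sum is 4 + 5 = 9, and `add_then_shift(4, 5, 6)` is 0. An input of at least 59 would be needed before the result becomes nonzero. Note also that psraw shifts 16-bit words; for 32-bit elements, psrad is the doubleword form of the same shift.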
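You can also change the array sizes through ebx and ecx. I loaded both from eax only as an optimization, because all the constants equal 4 in this particular case. eax is also the 4-byte step between elements, so that step has to stay 4 if you change the dimensions.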
BTW, I used MMX because of your question's title. There would be a more optimized way of doing this with plain x86 instructions, without the MMX instruction set. Because the array elements are "int" (that is, 32 bits) and this code handles one element at a time with movd, I think there is no gain from using the 64-bit MMX registers.
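For comparison, a rough sketch of the same computation in plain C++ with no MMX. The function name and parameters are made up for illustration. It mirrors the loop structure of the assembly: the outer loop plays the role of ebx (rows), the inner loop plays the role of ecx (columns), and b is advanced once per row, as edx is in the asm.

```cpp
#include <cstddef>

// Illustrative sketch (hypothetical name): adds b[i] to every element of row i.
// x and y point to the first element of row-major rows-by-cols arrays,
// and b must hold at least `rows` elements.
void add_row_constant(const int* x, const int* b, int* y,
                      std::size_t rows, std::size_t cols)
{
    for (std::size_t i = 0; i < rows; ++i) {       // like ebx: one pass per row
        for (std::size_t j = 0; j < cols; ++j) {   // like ecx: one pass per column
            *y++ = *x++ + b[i];                    // read x, add constant, store to y
        }
    }
}
```

It would be called as `add_row_constant(&x[0][0], b, &y[0][0], 4, 4);` with the arrays from the answer. An optimizing compiler may well vectorize such a loop on its own, though that depends on the compiler and its settings.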