GCC generated assembly for unaligned float access on ARM

Question

Hello, I am currently working on a program that needs to process a data blob containing a series of floats that may be unaligned (and sometimes are). I am compiling with gcc 4.6.2 for an ARM Cortex-A8. I have a question about the generated assembly code:

I wrote a minimal example. For the following test code

float aligned[2];
float *unaligned = (float*)(((char*)aligned)+2);

int main(int argc, char **argv)
{
    float f = unaligned[0];
    return (int)f;
}

the compiler (gcc 4.6.2 with -O3 optimization) produces

00008634 <main>:
    8634: e30038ec            movw         r3, #2284      ; 0x8ec
    8638: e3403001            movt         r3, #1
    863c: e5933000            ldr          r3, [r3]
    8640: edd37a00            vldr         s15, [r3]
    8644: eefd7ae7            vcvt.s32.f32 s15, s15
    8648: ee170a90            vmov         r0, s15
    864c: e12fff1e            bx           lr

The compiler cannot know here whether the data is aligned, but it nevertheless uses VLDR, which needs aligned data; otherwise the program crashes with a bus error.
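To make the misalignment in the test code concrete, here is a minimal sketch of a runtime alignment check. The names `is_float_aligned`, `storage`, and `demo_alignment_check` are hypothetical and not part of the original question, and the check assumes the common case where converting a pointer to `std::uintptr_t` yields its numeric address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original post): reports whether p is
// suitably aligned for a float. Assumes the pointer-to-integer conversion
// yields the numeric address, which holds on mainstream ARM and x86 ABIs.
inline bool is_float_aligned(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(float) == 0;
}

// Mirrors the layout from the question: a float array and a pointer
// into it that is offset by two bytes.
static float storage[2];

inline void demo_alignment_check() {
    const char *base = reinterpret_cast<const char *>(storage);
    assert(is_float_aligned(base));       // the array itself is aligned
    assert(!is_float_aligned(base + 2));  // two bytes in is misaligned
}
```

A check like this only detects the problem. It does not make a direct float load through the misaligned pointer safe.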

Now here is my actual question: is this behavior correct on the compiler's part, meaning I need to take care of alignment in my C++ code, or is this a bug in the compiler?

I should also mention my current workaround, which works and makes gcc copy the value before accessing it. The trick is to define a struct that contains only a float, mark it with the gcc packed attribute, and access the data through a pointer to that struct. Code snippet:

struct FloatWrapper { float f; } __attribute__((packed));
const FloatWrapper *x = reinterpret_cast<const FloatWrapper *>(rawX.data());
const FloatWrapper *y = reinterpret_cast<const FloatWrapper *>(rawY.data());

for (size_t i = 0; i < vertexCount; ++i) {
    vertices[i].x = x[i].f;
    vertices[i].y = y[i].f;
}

Recommended answer

As you pointed out, ARM ARM A3.2.1 states that VLDR generates an alignment fault regardless of the SCTLR.A value.

I tested your example on a Cortex-A9 and got

# float_align
[1] + Stopped (signal)     float_align

However, the ARM Cortex-A8 TRM, section 4.2.1, states:

If an alignment qualifier is not specified, and A=0, it is treated as unaligned access.

This is probably a half-baked explanation, since the ARM ARM gives more information, with a detailed table covering the individual instructions.

So I think the answer is that you need to take care of alignment yourself, since the compiler cannot determine in all scenarios which addresses you are loading from; for example, an address might only be known after linking.
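One common way to take care of alignment yourself, offered here as a sketch rather than as the method the answer's author had in mind, is to copy the bytes into a properly aligned float with `std::memcpy`. The helper name `load_unaligned_float` and the demo function are hypothetical. Compilers typically turn a fixed-size `memcpy` like this into whatever load sequence is safe on the target, but the exact instructions depend on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the original post): reads a float from an
// address that may be misaligned by copying its bytes into a local float.
inline float load_unaligned_float(const void *p) {
    float value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Builds a small blob whose floats start two bytes in, then reads them back.
inline void demo_unaligned_load() {
    unsigned char blob[2 + 2 * sizeof(float)] = {};
    const float expected[2] = {1.5f, -3.0f};
    std::memcpy(blob + 2, expected, sizeof expected);

    assert(load_unaligned_float(blob + 2) == 1.5f);
    assert(load_unaligned_float(blob + 2 + sizeof(float)) == -3.0f);
}
```

Unlike the packed-struct workaround, this relies only on standard C++ and avoids forming a `float` pointer to a misaligned address at all.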
