
Question

I want to play with the new ARM SVE instructions using open source tools.

As a start, I would like to assemble the minimal example present at: https://developer.arm.com/docs/dui0965/latest/getting-started-with-the-sve-compiler/assembling-sve-code

// example1.s
    .global main
main:
    mov     x0, 0x90000000
    mov     x8, xzr
    ptrue   p0.s                        //SVE instruction
    fcpy    z0.s, p0/m, #5.00000000     //SVE instruction
    orr     w10, wzr, #0x400
loop:
    st1w    z0.s, p0, [x0, x8, lsl #2]  //SVE instruction
    incw    x8                          //SVE instruction
    whilelt p0.s, x8, x10               //SVE instruction
    b.any   loop                        //SVE instruction
    mov     w0, wzr
    ret

However, when I try that on my Ubuntu 16.04:

sudo apt-get install binutils-aarch64-linux-gnu
aarch64-linux-gnu-as example1.S

it does not recognize any of the SVE assembly instructions, e.g.:

example1.S:6: Error: unknown mnemonic `ptrue' -- `ptrue p0.s'

I think this is because my GNU AS 2.26.1 is too old and does not have SVE support yet.

I'm also fine using LLVM or any other open source assembler.

Once I manage to assemble it, I want to run it in QEMU user mode, since QEMU 3.0.0 has SVE support.

Recommended answer

Automated example with assertions

  • usage
  • source

Below I describe how that example was achieved.

Assembly

The aarch64-linux-gnu-as 2.30 in Ubuntu 18.04 is already new enough for SVE as can be seen from: https://sourceware.org/binutils/docs-2.30/as/AArch64-Extensions.html#AArch64-Extensions

Otherwise, compiling Binutils from source is easy on Ubuntu 16.04, just do:

git clone git://sourceware.org/git/binutils-gdb.git
cd binutils-gdb
# master that I tested with.
git checkout 4de5434b694fc260d02610e8e7fec21b2923600a
./configure --target aarch64-elf --prefix "$(pwd)/ble"
make -j `nproc`
make install

I didn't check out to a tag because the last tag is a few months old, and I don't feel like grepping log messages for when SVE was introduced ;-)

Then use the compiled as and link with the packaged GCC on Ubuntu 16.04:

./binutils-gdb/ble/bin/aarch64-elf-as -c -march=armv8.5-a+sve \
    -o example1.o example1.S
aarch64-linux-gnu-gcc -march=armv8.5-a -nostdlib -o example1 example1.o

On Ubuntu 16.04, aarch64-linux-gnu-gcc 5.4 does not have -march=armv8.5-a, so just use -march=armv8-a and it should be fine. In any case, neither Ubuntu 16.04 nor 18.04 has -march=armv8-a+sve which will be the best option when it arrives.

Alternatively, instead of passing -march=armv8.5-a+sve, you can also add the following to the start of the .S source code:

.arch armv8.5-a+sve

On Ubuntu 19.04 Binutils 2.32, I also learnt about and tested:

aarch64-linux-gnu-as -march=all

which also works for SVE. I think I'll be using it more in the future, as it seems to enable all features in one go, not just SVE!

QEMU simulation

The procedure to step debug it on QEMU is explained at: How to single step ARM assembly in GDB on QEMU?

First I made the example into a minimal self contained Linux executable:

.data
    x: .double        1.5,  2.5,  3.5,  4.5
    y: .double        5.0,  6.0,  7.0,  8.0
    y_expect: .double 8.0, 11.0, 14.0, 17.0
    a: .double        2.0
    n: .word          4

.text
.global _start
_start:
    ldr x0, =x
    ldr x1, =y
    ldr x2, =a
    ldr x3, =n
    bl daxpy

    /* exit */
    mov x0, #0
    mov x8, #93
    svc #0


/* Multiply by a scalar and add.
 *
 * Operation:
 *
 *      Y += a * X
 *
 * C signature:
 *
 *      void daxpy(double *x, double *y, double *a, int *n)
 *
 * The name "daxpy" comes from BLAS (level 1), documented alongside LAPACK:
 * http://www.netlib.org/lapack/explore-html/de/da4/group__double__blas__level1_ga8f99d6a644d3396aa32db472e0cfc91c.html
 *
 * Adapted from: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf
 */
daxpy:
    ldrsw x3, [x3]
    mov x4, #0
    whilelt p0.d, x4, x3
    ld1rd z0.d, p0/z, [x2]
.loop:
    ld1d z1.d, p0/z, [x0, x4, lsl #3]
    ld1d z2.d, p0/z, [x1, x4, lsl #3]
    fmla z2.d, p0/m, z1.d, z0.d
    st1d z2.d, p0, [x1, x4, lsl #3]
    incd x4
    whilelt p0.d, x4, x3
    b.first .loop
    ret

You can run it with:

qemu-aarch64 -L /usr/aarch64-linux-gnu -E LD_BIND_NOW=1 ./example1

and it exits cleanly.

Next, we can step debug to confirm that the sum was actually made:

qemu-aarch64 -g 1234 -L /usr/aarch64-linux-gnu -E LD_BIND_NOW=1 ./example1

and:

./binutils-gdb/ble/bin/aarch64-elf-gdb -ex 'file example1' \
  -ex 'target remote localhost:1234' -ex 'set sysroot /usr/aarch64-linux-gnu'

Now, step up to right after bl daxpy, and run:

>>> p (double[4])y_expect
$1 = {[0] = 8, [1] = 11, [2] = 14, [3] = 17}
>>> p (double[4])y
$2 = {[0] = 8, [1] = 11, [2] = 14, [3] = 17}

which confirms that the sum was actually done as expected.

Observing SVE registers seems unimplemented as I can't find anything under: https://github.com/qemu/qemu/tree/v3.0.0/gdb-xml but it should not be too hard to implement by copying other FP registers? Asked at: http://lists.nongnu.org/archive/html/qemu-discuss/2018-10/msg00020.html

You can currently already observe it partially and indirectly by doing:

i r d0 d1 d2

because the first entry of SVE register zX is shared with the older vX FP registers, but we can't see p at all.
