Is n negative, positive, or zero? Return 1, 2, or 4

Question

I'm building a PowerPC interpreter, and it works quite well. In the Power architecture, the condition register CR0 (roughly what EFLAGS is on x86) is updated by almost any instruction. It is set as follows: CR0 is 1 if the last result was negative, 2 if the last result was positive, and 4 otherwise.

My first, naive way to implement this is:

if (n < 0)
    cr0 = 1;
else if (n > 0)
    cr0 = 2;
else
    cr0 = 4;

However, I understand that all those branches won't be optimal when they run millions of times per second. I've seen some bit hacking on SO, but none of it seemed adequate. For example, I found many examples that convert a number to -1, 0, or 1 according to its sign. But how do I map -1 to 1, 1 to 2, and 0 to 4? I'm asking for the help of the Bit Hackers...

Thanks in advance.

Update: First of all, thanks guys, you've been great. I'll test all of your code carefully for speed, and you'll be the first to know who the winner is.

@jalf: About your first piece of advice, I wasn't actually calculating CR0 on every instruction. I was instead keeping a lastResult variable and doing the comparison when (and if) a following instruction asked for a flag. Three main motivations took me back to the "every time" update:

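  1. On PPC you are not forced to update CR0 the way you are on x86 (where ADD always changes EFLAGS, even when that isn't needed). There are two flavours of ADD, and only one of them updates CR0. If the compiler chose the updating flavour, it means CR0 is going to be used at some point, so there is nothing to gain by delaying...
  2. There is a particularly painful instruction, mtcrf, that lets you change CR0 arbitrarily. You can even set it to 7, which has no arithmetic meaning... This alone rules out keeping only a "lastResult" variable.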

Recommended answer

First, if this variable is to be updated after (nearly) every instruction, the obvious piece of advice is this:

Don't.

Only update it when a subsequent instruction needs its value. At any other time, there's no point in updating it.
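As a rough illustration of that deferred update, here is a minimal sketch in C. The `CpuState` struct and the `note_result`, `write_cr0`, and `read_cr0` helpers are hypothetical names invented for this example, not part of any real emulator. The sketch assumes a 32-bit signed result and uses a "stale" flag so that a direct write to CR0 (as with mtcrf) simply replaces the cached value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interpreter state; every name here is illustrative. */
typedef struct {
    int32_t  last_result;  /* result of the last CR0-updating instruction */
    uint32_t cr0;          /* cached CR0 value, valid only if !cr0_stale */
    bool     cr0_stale;    /* true when cr0 must be recomputed */
} CpuState;

/* Called by instructions that update CR0: only remember the result. */
static inline void note_result(CpuState *cpu, int32_t result)
{
    cpu->last_result = result;
    cpu->cr0_stale = true;
}

/* Called by instructions that write CR0 directly (mtcrf, for instance). */
static inline void write_cr0(CpuState *cpu, uint32_t value)
{
    cpu->cr0 = value;
    cpu->cr0_stale = false;
}

/* Called only when an instruction actually reads CR0. */
static inline uint32_t read_cr0(CpuState *cpu)
{
    if (cpu->cr0_stale) {
        int32_t r = cpu->last_result;
        cpu->cr0 = (r < 0) ? 1u : (r > 0) ? 2u : 4u;
        cpu->cr0_stale = false;
    }
    return cpu->cr0;
}
```

Whether the extra flag check pays for itself depends on how often the guest code reads CR0 after updating it, so this is something to measure rather than assume.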

But anyway, when we do update it, this is the behavior we want:

R < 0  => CR0 == 0b001
R > 0  => CR0 == 0b010
R == 0 => CR0 == 0b100

Ideally, we won't need to branch at all. Here's one possible approach:

  1. Set CR0 to the value 1. (If you really want speed, investigate whether this can be done without fetching the constant from memory. Even if you have to spend a couple of instructions on it, it may well be worth it.)
  2. If R >= 0, shift left by one bit.
  3. If R == 0, shift left by one bit.

Steps 2 and 3 can be rewritten to eliminate the "if" part, because a comparison evaluates to 0 or 1 and can be used directly as the shift count:

CR0 <<= (R >= 0);
CR0 <<= (R == 0);
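To make the two shifts concrete, here is a minimal C sketch that wraps them in a function. The name `cr0_two_shifts` and the choice of a 32-bit signed result are assumptions made for this example only.

```c
#include <stdint.h>

/* Hypothetical helper: computes the CR0 value for a signed 32-bit
 * result using the two conditional shifts described above. */
static inline uint32_t cr0_two_shifts(int32_t r)
{
    uint32_t cr0 = 1;   /* start with the "negative" encoding */
    cr0 <<= (r >= 0);   /* a comparison yields 0 or 1 in C */
    cr0 <<= (r == 0);   /* zero gets shifted a second time */
    return cr0;
}
```

For a negative r neither shift fires and the result stays 1; for a positive r only the first fires, giving 2; for zero both fire, giving 4.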

Is this faster? I don't know. As always, when you are concerned about performance, you need to measure, measure, measure.

However, I can see a few advantages to this approach:

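  1. We avoid branches completely.
  2. We avoid memory loads/stores.
  3. The instructions we rely on (bit shifts and comparisons) should have low latency, which is not always the case for multiplication, for example.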

The downside is that we have a dependency chain across all three statements (the initialization and the two shifts): each one modifies CR0, which is then used by the next. This limits instruction-level parallelism somewhat.

To shorten this dependency chain, we could do something like this instead:

CR0 <<= ((R >= 0) + (R == 0));

This way we only have to modify CR0 once after its initialization.

Or, doing everything in a single line:

CR0 = 1 << ((R >= 0) + (R == 0));
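As a sanity check, the single-line form can be compared against the branching version from the question. The sketch below is in C, and the function names `cr0_branchy` and `cr0_one_shift` are made up for this example. Whether the compiler actually emits branch-free machine code for either function is not guaranteed and is worth checking in the generated assembly.

```c
#include <stdint.h>

/* Reference version with branches, mirroring the question's code. */
static inline uint32_t cr0_branchy(int32_t r)
{
    if (r < 0) return 1;
    if (r > 0) return 2;
    return 4;
}

/* Single-expression version: the shift count is 0, 1, or 2. */
static inline uint32_t cr0_one_shift(int32_t r)
{
    return 1u << ((r >= 0) + (r == 0));
}
```

Keeping the branching version around as a reference makes it easy to test any further variation against known-good results before benchmarking it.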

Of course, there are many possible variations on this theme, so go ahead and experiment.
