Why should short-circuit logical operators be faster?


Question

This question is not about optimizing code. It is a technical question about the performance difference between short-circuit logical operators and normal (non-short-circuit) logical operators, which may come down to how they are executed at the hardware level.

Basically, a logical AND or OR takes one cycle, whereas short-circuit evaluation uses branching and can take a varying number of cycles. I know that branch predictors can make this evaluation efficient, but I don't see how it could be faster than one cycle.

Yes, if the right operand is expensive, then avoiding its evaluation is beneficial. But for simple conditions like X & (Y | Z), assuming these are atomic variables, non-short-circuit logical operators would likely perform faster. Am I right?
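To make the condition in question concrete, here is a minimal C sketch of both forms. It assumes x, y and z are plain ints that already hold 0 or 1; the function names are made up for illustration. Both functions return the same truth value under that assumption. Whether either one actually compiles to branches depends on the compiler, the optimization level and the target.

```c
#include <stdbool.h>

/* Short-circuit form: y is read only if x is non-zero,
   and z is read only if x is non-zero and y is zero. */
bool cond_short_circuit(int x, int y, int z) {
    return x && (y || z);
}

/* Eager form: all three operands are always read and combined bitwise.
   Assumption: inputs are 0 or 1, otherwise this is NOT equivalent. */
bool cond_eager(int x, int y, int z) {
    return (x & (y | z)) != 0;
}
```

For example, cond_short_circuit(1, 0, 1) and cond_eager(1, 0, 1) are both true, and both are false as soon as x is 0.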

I assumed that short-circuit logical operators use branching (no official source, just my own reasoning), because how else would you make those jumps while executing instructions in order?

Recommended answer

This is very late, but since this hasn't been answered yet (...), I'm going to have a go at it.

You already pointed out branch prediction, and that point is correct. There are also other hardware-related issues on modern CPUs, mostly related to instruction-level parallelism and dependencies between operations.

A short-circuit operator requires A and THEN B to be evaluated, and B must not be evaluated if A is false. This leads us back to branches, and to CPU pipeline flushes when speculative execution goes down the wrong path. This gets more costly the more conditions need to be checked in succession. Non-short-circuit operations, on the other hand, can be cheaper, since CPUs can execute "many" instructions in the same clock cycle thanks to multiple physical ALUs/FPUs/AGUs etc. being present.
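As a rough illustration of a condition checked many times in succession, the C sketch below counts array elements inside a range in two ways. The function names and the range-check scenario are assumptions chosen for this example, not something from the question. The first version is written with &&, the second evaluates both comparisons unconditionally and adds the 0-or-1 result. An optimizing compiler may well turn both into the same machine code, so treat this as a picture of the idea rather than a performance claim.

```c
#include <stddef.h>

/* Short-circuit version: the second comparison is only
   evaluated when the first one is true. */
size_t count_in_range_short_circuit(const int *v, size_t n, int lo, int hi) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] >= lo && v[i] <= hi) {
            count++;
        }
    }
    return count;
}

/* Eager version: both comparisons are always evaluated; each yields
   0 or 1, so their bitwise AND is also 0 or 1 and can be added directly. */
size_t count_in_range_eager(const int *v, size_t n, int lo, int hi) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += (size_t)((v[i] >= lo) & (v[i] <= hi));
    }
    return count;
}
```

The two comparisons in the eager version do not depend on each other, which is the kind of independence that lets a CPU with several ALUs work on them at the same time.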

To drive this point home, let's look at the simplest case in assembly:

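a && b:

cmp    a, 0          ; test a
jne    LABEL_A       ; a != 0: go on and test b
---more code---      ; a == 0: b is never tested
LABEL_A:
cmp    b, 0          ; test b
jne    RETURN_LABEL  ; b != 0: the whole condition is true
---more code---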

as opposed to (assuming instructions like setb were used to clamp the operands to [0, 1]):

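a & b:

and    a, a, b       ; combine both operands in one instruction
cmp    a, 0          ; test the combined result
jne    RETURN_LABEL  ; a single branch for the whole condition
---more code---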
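The clamping assumption matters for correctness, not just speed. The C sketch below (function names are hypothetical) shows that a raw bitwise AND disagrees with && for operands such as 1 and 2, which are both "true" but share no set bits, and that normalizing each operand to 0 or 1 first restores the logical result.

```c
#include <stdbool.h>

/* Logical AND: true when both operands are non-zero. */
bool and_logical(int a, int b) {
    return a && b;
}

/* Raw bitwise AND: true only when the operands share a set bit. */
bool and_bitwise_raw(int a, int b) {
    return (a & b) != 0;
}

/* Bitwise AND after clamping each operand to 0 or 1:
   same truth value as and_logical, but both sides are always evaluated. */
bool and_bitwise_normalized(int a, int b) {
    return ((a != 0) & (b != 0)) != 0;
}
```

With a = 1 and b = 2, and_logical and and_bitwise_normalized return true, while and_bitwise_raw returns false because 1 & 2 is 0.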

The difference should be self-evident from the resulting assembly. But yes, you are right that you should definitely use short-circuiting to avoid an expensive calculation B when A is false. Even then, the CPU might speculatively execute the test for B anyway. So, very simply put, you can "only make things worse by using short circuiting operators (sic!)".
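The case where short-circuiting clearly pays off, an expensive right operand, can be sketched in C as follows. The call counter and the function expensive_check are stand-ins invented for this example; the counter only makes it visible at the language level whether the right operand was evaluated. Any speculative work a CPU does behind the scenes is not observable this way.

```c
/* Counts how often the "expensive" operand is evaluated
   (hypothetical helper for illustration only). */
int expensive_calls = 0;

/* Stand-in for an expensive computation B. */
int expensive_check(int v) {
    expensive_calls++;
    return v > 100;
}

/* Short-circuit: expensive_check runs only when a is non-zero. */
int test_short_circuit(int a, int v) {
    return a && expensive_check(v);
}

/* Eager: expensive_check always runs, even when a is zero. */
int test_eager(int a, int v) {
    return (a != 0) & expensive_check(v);
}
```

Calling test_short_circuit(0, 500) leaves expensive_calls unchanged, whereas test_eager(0, 500) increments it even though the result is 0 in both cases.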

