Problem description
Hi Folks,
The page http://www.abarnett.demon.co.uk/tutorial.html#FASTFOR states:
for( i=0; i<10; i++){ ... }
i loops through the values 0,1,2,3,4,5,6,7,8,9. If you don't care about the
order of the loop counter, you can do this instead:
for( i=10; i--; ) { ... }
Using this code, i loops through the values 9,8,7,6,5,4,3,2,1,0, and the loop
should be faster. This works because it is quicker to process "i--" as the
test condition, which says "is i non-zero? If so, decrement it and
continue.". For the original code, the processor has to calculate "subtract
i from 10. Is the result non-zero? If so, increment i and continue.". In
tight loops, this makes a considerable difference.
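To make the quoted claim concrete, here is a small host-side sketch (not taken from the tutorial page) that records which values each loop form hands to its body. The helper names fill_up, fill_down and is_reverse are made up for illustration.

```c
/* Fill out[0..n-1] using the count-up form: 0, 1, ..., n-1. */
void fill_up(int *out, int n)
{
    int i, k = 0;
    for (i = 0; i < n; i++)
        out[k++] = i;
}

/* Fill out[0..n-1] using the count-down form from the quoted page.
   "i--" tests the old value of i and then decrements it, so the
   body sees n-1, n-2, ..., 0. */
void fill_down(int *out, int n)
{
    int i, k = 0;
    for (i = n; i--; )
        out[k++] = i;
}

/* Returns 1 if down[] is exactly up[] reversed, 0 otherwise. */
int is_reverse(const int *up, const int *down, int n)
{
    int k;
    for (k = 0; k < n; k++)
        if (up[k] != down[n - 1 - k])
            return 0;
    return 1;
}
```

Calling fill_up and fill_down with n = 10 and then is_reverse on the two arrays confirms that both forms visit the same ten values, only in opposite order. Whether the second form is actually faster is a separate question that depends on the compiler and target.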
How far does this hold true in the light of modern optimizing compilers? And
will it make a significant difference on embedded systems?
Thanks,
-Neo
"Do U really think, what U think real is really real?"
There is nothing like an experiment to test a theory. I just tried it with
AVRGCC:
void countDown(void){
    int i;
    for(i=10; i!=0; i--) doSomething();
}
void countUp(void){
    int i;
    for(i=0;i<10;i++) doSomething();
}
The generated code is:
000000ce <countDown>:
}
void countDown(void){
  ce:   cf 93           push    r28
  d0:   df 93           push    r29
    int i;
    for(i=10; i!=0; i--) doSomething();
  d2:   ca e0           ldi     r28, 0x0A       ; 10
  d4:   d0 e0           ldi     r29, 0x00       ; 0
  d6:   0e 94 5d 00     call    0xba
  da:   21 97           sbiw    r28, 0x01       ; 1
  dc:   e1 f7           brne    .-8             ; 0xd6
  de:   df 91           pop     r29
  e0:   cf 91           pop     r28
  e2:   08 95           ret
000000e4 <countUp>:
}
void countUp(void){
  e4:   cf 93           push    r28
  e6:   df 93           push    r29
  e8:   c9 e0           ldi     r28, 0x09       ; 9
  ea:   d0 e0           ldi     r29, 0x00       ; 0
    int i;
    for(i=0;i<10;i++) doSomething();
  ec:   0e 94 5d 00     call    0xba
  f0:   21 97           sbiw    r28, 0x01       ; 1
  f2:   d7 ff           sbrs    r29, 7
  f4:   fb cf           rjmp    .-10            ; 0xec
  f6:   df 91           pop     r29
  f8:   cf 91           pop     r28
  fa:   08 95           ret
Counting down instead of up saves one whole instruction. It could make a
difference, I suppose.
However, the compiler cannot optimise as well if anything in the loop
depends on the value of 'i'.
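A plausible reading of why: in the first listing the compiler was free to run countUp backwards (it loads 9 and counts down with sbiw), because doSomething() never sees i. Once i is passed to the body, the order and range of values become observable and that trick is no longer available. The host-side sketch below illustrates this with a hypothetical record() function standing in for doSomething(i); the other names are also made up.

```c
#define MAX_CALLS 16

int seen[MAX_CALLS];  /* arguments in the order they arrived */
int n_seen;

/* Hypothetical stand-in for doSomething(i): it just logs its argument. */
void record(int i)
{
    if (n_seen < MAX_CALLS)
        seen[n_seen++] = i;
}

/* Count-up loop shape: the body observes 0, 1, ..., 9. */
void up_with_i(void)
{
    int i;
    n_seen = 0;
    for (i = 0; i < 10; i++)
        record(i);
}

/* Count-down loop shape with "i != 0" as the test:
   the body observes 10, 9, ..., 1. */
void down_with_i(void)
{
    int i;
    n_seen = 0;
    for (i = 10; i != 0; i--)
        record(i);
}

int first_seen(void) { return seen[0]; }
int last_seen(void)  { return seen[n_seen - 1]; }
int count_seen(void) { return n_seen; }
```

Note that the count-down form used in this thread (i=10; i!=0; i--) hands the body 10 down to 1, not 9 down to 0, so it is not a drop-in replacement for the count-up loop whenever the body uses i as an index.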
void countDown(void){
    int i;
    for(i=10; i!=0; i--) doSomething(i);
}
void countUp(void){
    int i;
    for(i=0;i<10;i++) doSomething(i);
}
becomes:
void countDown(void){
  ce:   cf 93           push    r28
  d0:   df 93           push    r29
    int i;
    for(i=10; i!=0; i--) doSomething(i);
  d2:   ca e0           ldi     r28, 0x0A       ; 10
  d4:   d0 e0           ldi     r29, 0x00       ; 0
  d6:   ce 01           movw    r24, r28
  d8:   0e 94 5d 00     call    0xba
  dc:   21 97           sbiw    r28, 0x01       ; 1
  de:   d9 f7           brne    .-10            ; 0xd6
  e0:   df 91           pop     r29
  e2:   cf 91           pop     r28
  e4:   08 95           ret
000000e6 <countUp>:
}
void countUp(void){
  e6:   cf 93           push    r28
  e8:   df 93           push    r29
    int i;
    for(i=0;i<10;i++) doSomething(i);
  ea:   c0 e0           ldi     r28, 0x00       ; 0
  ec:   d0 e0           ldi     r29, 0x00       ; 0
  ee:   ce 01           movw    r24, r28
  f0:   0e 94 5d 00     call    0xba
  f4:   21 96           adiw    r28, 0x01       ; 1
  f6:   ca 30           cpi     r28, 0x0A       ; 10
  f8:   d1 05           cpc     r29, r1
  fa:   cc f3           brlt    .-14            ; 0xee
  fc:   df 91           pop     r29
  fe:   cf 91           pop     r28
 100:   08 95           ret
This time there are a whole 2 extra instructions. I don't think this is
such a big deal. Unrolling the loop would give a better result.
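For readers unfamiliar with unrolling, a minimal hand-written sketch follows. It is an illustration only, not code that was measured in this thread; do_something() and the calls counter are hypothetical stand-ins so the effect can be checked on a host machine.

```c
int calls;  /* how many times the stub ran */

/* Hypothetical stand-in for doSomething(). */
void do_something(void) { calls++; }

/* Rolled: ten iterations, so the counter update and branch
   are executed ten times. */
void rolled(void)
{
    int i;
    for (i = 10; i != 0; i--)
        do_something();
}

/* Unrolled by hand, five calls per pass: the loop overhead
   (decrement + branch) is paid twice instead of ten times,
   at the cost of larger code. */
void unrolled_by_5(void)
{
    int i;
    for (i = 2; i != 0; i--) {
        do_something();
        do_something();
        do_something();
        do_something();
        do_something();
    }
}
```

Both functions call the stub ten times. GCC can also be asked to do this itself with -funroll-loops; whether the extra flash is worth it on a small AVR is a judgment call.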
cheers,
Al
Many micros have a decrement-and-jump-if-zero (or non-zero) machine
instruction, so a decent optimising compiler should know this and use it in
count-down-to-zero loops. Counting up often needs a compare followed by a
jump-if-zero (or non-zero), which will be a tad slower.
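As a rough illustration, the C shape that maps most directly onto such an instruction (DJNZ on the 8051 or Z80, for example) is a do/while that decrements and tests in one step. The sketch below is an assumption about how one might write it, with made-up names; the AVR listings earlier in the thread do the same job with sbiw followed by brne.

```c
int ticks;  /* incremented by the sample body below */

/* Hypothetical loop body used for checking the iteration count. */
void tick(void) { ticks++; }

/* Calls body() exactly n times; n must be >= 1.
   "--n != 0" decrements and tests in one expression, which is
   what a decrement-and-branch-if-not-zero instruction does. */
void repeat_down(unsigned char n, void (*body)(void))
{
    do {
        body();
    } while (--n != 0);
}

/* Count-up equivalent: each pass needs an increment and then a
   compare against the limit before the branch. */
void repeat_up(unsigned char n, void (*body)(void))
{
    unsigned char i;
    for (i = 0; i < n; i++)
        body();
}
```

Both repeat_down(10, tick) and repeat_up(10, tick) run the body ten times. Whether a given compiler actually emits the single instruction for the first form depends on the target and the optimisation level.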
Ian
The answer is "implementation-dependent".
A major advantage of writing in C is that you can, if you choose, write
understandable, maintainable code. This kind of hand-optimisation has the
opposite effect. If you really need to care about exactly how many
instruction cycles a loop takes, code it in assembly language. Otherwise, for
the sake of those who come after you, please write your C readably and
leave the compiler to do the optimisation. These days, most compilers can
optimise almost as well as you can, for most "normal" operations.
Regards,
--
Peter Bushell
http://www.software-integrity.com/