Problem description
Matlab/Octave algorithm example:
input vector: [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ]
output vector: [ 1 1 2 2 7 7 7 7 5 5 5 5 9 ]
The algorithm is very simple: it walks through the vector and replaces every zero with the last non-zero value. That is trivial with a slow for (i=1:length) loop that can refer to the previous element (i-1), but it looks impossible to formulate in a fast vectorized form. I tried merge() and shift(), but that only works for the first occurrence of a zero, not for an arbitrary run of them.
Can it be done in a vectorized form in Octave/Matlab, or must C be used to get sufficient performance on large amounts of data?
Thanks, Pawel
PS: I have another, similar slow for-loop algorithm to speed up, and it seems generally impossible to refer to previous values in a vectorized form, the way an SQL lag(), a group by, or a loop with (i-1) easily would. But Octave/Matlab loops are terribly slow.
Has anyone found a solution to this general problem or is this futile for fundamental Octave/Matlab design reasons?
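For the simplest case of this general question, referring to the element one step back, a shifted copy of the vector can play the role of (i-1). This is only a sketch of that idea, not a general answer. The names `x`, `prev`, `d` and `changed` are made up for illustration, and the first element has no predecessor, so it is filled with NaN here by assumption.

```octave
x = [1 0 2 0 7 7 7 0 5 0 0 0 9];

% prev(i) equals x(i-1); NaN (an assumed placeholder) where there is no predecessor
prev = [NaN, x(1:end-1)];

d = x - prev;            % element-wise difference to the previous value
changed = (x ~= prev);   % 1 where the value differs from its predecessor
```

Note that this only covers dependence on the previous input value. The zero-filling problem depends on the previously computed output, which is why a plain shift handles only the first zero in a run.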
========== EDIT ===============
Benchmark comparison:
==== SOLUTION 1 (slow loop)
in = out = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
tic; for i=2:length(out) if (out(i)==0) out(i)=out(i-1); endif; endfor; toc;
[in(1:20); out(1:20)] # test to show side by side if ok
Elapsed time is 15.047 seconds.
==== SOLUTION 2 by Dan (~80 times faster)
in = V = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
tic;
d = double(diff([0,V])>0);
d(find(d(2:end))+1) = find(diff([0,~V])==-1) - find(diff([0,~V])==1);
out = V(cumsum(~~V+d)-1);
toc;
[in(1:20); out(1:20)] # shows it works ok
Elapsed time is 0.188167 seconds.
# 15.047 / 0.188167 = 79.97 times improvement
==== SOLUTION 3 by GameOfThrows (~115 times faster)
in = a = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
tic;
pada = [a,888];
b = pada(find(pada >0));
bb = b(:,1:end-1);
c = find (pada==0);
d = find(pada>0);
len = d(2:end) - (d(1:end-1)); # run lengths; named len so the built-in length() is not shadowed
t = accumarray(cumsum([1,len])',1);
out = R = bb(cumsum(t(1:end-1)));
toc;
Elapsed time is 0.130558 seconds.
# 15.047 / 0.130558 = 115.25 times improvement
==== Magical SOLUTION 4 by Luis Mendo (~250 times faster)
Updated to a neat one-liner:
in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] , 1, 100000);
tic;
out = nonzeros(in).'(cumsum(in~=0));
toc;
Elapsed time is 0.0597501 seconds.
# 15.047 / 0.0597501 = 251.83 times improvement
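When comparing timings like these, it is worth confirming that the fast version returns exactly what the loop returns. A minimal check, assuming a smaller repeat count (1000, chosen arbitrarily here) so that it runs quickly, and with the names `ref`, `u` and `same` made up for illustration:

```octave
in = repmat([1 0 2 0 7 7 7 0 5 0 0 0 9], 1, 1000);

% Reference result from the plain loop (Solution 1)
ref = in;
for i = 2:length(ref)
  if ref(i) == 0
    ref(i) = ref(i-1);
  end
end

% Vectorized result (Solution 4, written in two steps so it also parses in Matlab)
u = nonzeros(in);
out = u(cumsum(in ~= 0)).';

same = isequal(ref, out);   % true if both approaches agree element for element
```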
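Dan, GameOfThrows and Luis: I really appreciate your quick, sharp and effective help with this case. These are all great solutions with excellent speedups.
I am surprised that such an improvement is possible, and I will now post a second challenge. I had first decided to skip it because I thought it was harder and out of reach, but given the evidence here, I hope I was wrong.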
Recommended answer
The following simple approach does what you want, and is probably very fast:
in = [1 0 2 0 7 7 7 0 5 0 0 0 9];
t = cumsum(in~=0);
u = nonzeros(in);
out = u(t).';
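To see why this works, it may help to look at the intermediate values. The sketch below repeats the same steps on the example vector and adds one assumption that is not part of the answer above: a guard for inputs that start with a zero, where `t` would contain 0 and the indexing would fail. The names `mask`, `in2`, `u2`, `out2` and `fill_value` are hypothetical.

```octave
in = [1 0 2 0 7 7 7 0 5 0 0 0 9];

mask = (in ~= 0);   % 1 where in is non-zero: [1 0 1 0 1 1 1 0 1 0 0 0 1]
t = cumsum(mask);   % number of non-zeros seen so far: [1 1 2 2 3 4 5 5 6 6 6 6 7]
u = nonzeros(in);   % the non-zero values, as a column: [1 2 7 7 7 5 9].'
out = u(t).';       % each position picks the most recent non-zero value

% Assumption (not in the original answer): if the input starts with zeros,
% t is 0 there and u(t) is an invalid index. One possible guard is to
% prepend a placeholder and shift every index up by one.
in2 = [0 0 3 0 4];
fill_value = 0;                      % hypothetical value used for leading zeros
u2 = [fill_value; nonzeros(in2)];
out2 = u2(cumsum(in2 ~= 0) + 1).';   % gives [0 0 3 3 4]
```

The key point is that `t` stays constant across a run of zeros, so every zero in the run maps to the same entry of `u`, however long the run is.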