How to compute a sum of squared sums

Question

I have a sum of sums that I want to speed up. In one case it is:

S_{x,y,k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly}

In the other case it is:

S_{x,y} ( S_{k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly} )^2

Note: S_{indices} denotes the sum over those indices.

I have figured out how to do the first case with numpy's einsum, and it gives an amazing speedup of roughly 160x.

I have also thought about expanding the square, but won't that be a killer, since I would then need to sum over x,y,k,l and a second copy of k,l instead of just x,y,k,l?
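For reference, the expansion being asked about can be written out as below. This is only a restatement of the question's concern; the index names m, n for the second copy of k, l and the shorthand a_{kl} are introduced here for illustration.

```latex
\Bigl(\sum_{k,l} a_{kl}\Bigr)^{2} = \sum_{k,l}\,\sum_{m,n} a_{kl}\,a_{mn},
\qquad a_{kl} = Fu_{ku}\,Fv_{lv}\,Fx_{kx}\,Fy_{ly}
```

Evaluated term by term, the right-hand side visits N_k N_l times as many products per (u, v, x, y) as the inner sum on the left, which is the cost the question is worried about.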

Here is an implementation that demonstrates the difference and the solution I have with einsum.

import time
import numpy as np

# Problem sizes
Nx = 3
Ny = 4
Nk = 5
Nl = 6
Nu = 7
Nv = 8
# Random test data
Fx = np.random.rand(Nx, Nk)
Fy = np.random.rand(Ny, Nl)
Fu = np.random.rand(Nu, Nk)
Fv = np.random.rand(Nv, Nl)
P = np.random.rand(Nx, Ny)
B = np.random.rand(Nk, Nl)
I1 = np.zeros([Nu, Nv])
I2 = np.zeros([Nu, Nv])
# Reference implementation with explicit loops
t = time.time()
for iu in range(Nu):
    for iv in range(Nv):
        for ix in range(Nx):
            for iy in range(Ny):
                # Inner sum over k, l for fixed u, v, x, y
                S = 0.
                for ik in range(Nk):
                    for il in range(Nl):
                        S += Fu[iu,ik]*Fv[iv,il]*Fx[ix,ik]*Fy[iy,il]*P[ix,iy]*B[ik,il]
                I1[iu, iv] += S
                I2[iu, iv] += S**2.
print(time.time() - t); t = time.time()
# 0.0787379741669
# I1 in a single einsum call
I1_ = np.einsum('uk, vl, xk, yl, xy, kl->uv', Fu, Fv, Fx, Fy, P, B)
print(time.time() - t)
# 0.00049090385437
print(np.allclose(I1_, I1))
# True
# Solution by expanding the square (not ideal)
t = time.time()
I2_ = np.einsum('uk,vl,xk,yl,um,vn,xm,yn,kl,mn,xy->uv', Fu,Fv,Fx,Fy,Fu,Fv,Fx,Fy,B,B,P**2)
print(time.time() - t)
# 0.0226809978485 <- faster than for loop but still much slower than I1_ einsum
print(np.allclose(I2_, I2))
# True

As shown, I've figured out how to compute I1 with einsum (I1_ above).

I also added how to compute I2_ by expanding the square, but the speedup is a bit disappointing, as was to be expected: about 3.47x, compared to about 160x for I1_.

The speedups don't seem to be consistent: earlier I got 40x and 1.2x, but now I get different numbers. Either way, the difference and the question remain.

I tried to simplify the sum I'm actually after but messed up, and the simplified sum above allows for the excellent answer provided by @user5402.
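To illustrate why the simplified sum (without P and B) is so much easier, here is a minimal sketch of one way it factorizes. It is not a reproduction of @user5402's answer, only an assumption about the kind of simplification that form permits; the function name sums_without_weights and the test sizes are made up for this example. Without P and B, the inner sum over k, l splits into a product of two independent matrix products, so neither I1 nor I2 needs a four-index array.

```python
import numpy as np

def sums_without_weights(Fu, Fv, Fx, Fy):
    """Hypothetical helper: I1 and I2 for the sum WITHOUT P and B."""
    # Inner sum over k, l factorizes: S[u,v,x,y] = A[u,x] * C[v,y]
    A = Fu @ Fx.T  # shape (Nu, Nx), sum over k
    C = Fv @ Fy.T  # shape (Nv, Ny), sum over l
    # Sum over x, y of S, and of S**2, also factorize into outer products
    I1 = np.outer(A.sum(axis=1), C.sum(axis=1))
    I2 = np.outer((A**2).sum(axis=1), (C**2).sum(axis=1))
    return I1, I2

# Small random test case (sizes chosen arbitrarily for illustration)
rng = np.random.default_rng(0)
Fx = rng.random((3, 5))
Fy = rng.random((4, 6))
Fu = rng.random((7, 5))
Fv = rng.random((8, 6))

I1, I2 = sums_without_weights(Fu, Fv, Fx, Fy)

# Reference: build the full four-index array with einsum and reduce it
E_ref = np.einsum('uk,vl,xk,yl->uvxy', Fu, Fv, Fx, Fy)
print(np.allclose(I1, E_ref.sum(axis=(2, 3))))
print(np.allclose(I2, (E_ref**2).sum(axis=(2, 3))))
```

Once P_{xy} and B_{kl} are included, the x, y and k, l sums are coupled and this outer-product shortcut no longer applies as written.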

I've edited the code above to demonstrate the sum which is below:

I1 = S_{x,y,k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly} P_{xy} B_{kl}

I2 = S_{x,y} ( S_{k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly} P_{xy} B_{kl} )^2

Recommended answer

I'll start a new answer since the problem has changed.

Try this:

E = np.einsum('uk, vl, xk, yl, xy, kl->uvxy', Fu, Fv, Fx, Fy, P, B)
E1 = np.einsum('uvxy->uv', E)
E2 = np.einsum('uvxy->uv', np.square(E))

I've found that it runs just as fast as the I1_ einsum.

Here is my test code: http://pastebin.com/ufwy7cLy
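As a self-contained check of the idea in this answer (build the inner sum once as a four-index array, then reduce it and its square over x, y), here is a minimal sketch. The helper name staged_sums, the two-step split of the contraction, and the test sizes are assumptions made for illustration; they are not taken from the linked test code. The comparison uses the optimize=True argument of np.einsum, which lets NumPy pick a contraction order itself.

```python
import numpy as np

def staged_sums(Fu, Fv, Fx, Fy, P, B):
    """Hypothetical helper: I1 and I2 for the sum WITH P and B."""
    # Step 1: contract over k first -> G[u,x,l]
    G = np.einsum('uk,xk,kl->uxl', Fu, Fx, B)
    # Step 2: contract over l -> inner sum T[u,v,x,y] (still without P)
    T = np.einsum('uxl,vl,yl->uvxy', G, Fv, Fy)
    # P[x,y] broadcasts over the trailing (x, y) axes
    E = T * P
    I1 = E.sum(axis=(2, 3))       # sum over x, y of the inner sum
    I2 = (E**2).sum(axis=(2, 3))  # sum over x, y of the squared inner sum
    return I1, I2

# Small random test case (sizes chosen arbitrarily for illustration)
rng = np.random.default_rng(1)
Fx = rng.random((3, 5))
Fy = rng.random((4, 6))
Fu = rng.random((7, 5))
Fv = rng.random((8, 6))
P = rng.random((3, 4))
B = rng.random((5, 6))

I1, I2 = staged_sums(Fu, Fv, Fx, Fy, P, B)

# Reference: the single einsum from the answer, with an optimized path
E_ref = np.einsum('uk,vl,xk,yl,xy,kl->uvxy', Fu, Fv, Fx, Fy, P, B,
                  optimize=True)
print(np.allclose(I1, np.einsum('uvxy->uv', E_ref)))
print(np.allclose(I2, np.einsum('uvxy->uv', np.square(E_ref))))
```

Both variants hold an array of shape (Nu, Nv, Nx, Ny) in memory, which is fine for the sizes in the question but may matter for much larger problems.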

08-11 16:28