This article describes how to perform batched tensor multiplication with NumPy; hopefully it offers a useful reference for anyone facing the same problem.
Problem description
I am trying to perform the following matrix and tensor multiplication, but batched.
I have a list of x vectors:
import numpy as np

x = np.array([[2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [5.0, 5.0]])
And the following matrix and tensor:
R = np.array(
    [
        [1.0, 1.0],
        [0.0, 1.0],
    ]
)

T = np.array(
    [
        [
            [2.0, 0.0],
            [0.0, 0.0],
        ],
        [
            [0.0, 0.0],
            [0.0, 2.0],
        ],
    ]
)
The batched matrix multiplication is relatively straightforward:
x.dot(R.T)
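(In einsum notation this is np.einsum('ij,lj->li', R, x), i.e. each row x[l] is mapped to R @ x[l].)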
However, I am struggling with the second part.
I tried using tensordot but with no success so far. What am I missing?
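For reference, one combination that does work (this is the tensordot/einsum approach credited to @Divakar in the benchmark comments of the answer below):

Tx = np.tensordot(T, x, axes=((1), (1)))  # Tx[i, k, l] = sum_j T[i, j, k] * x[l, j]; shape (2, 2, 4)
out = np.einsum('ikl,lk->li', Tx, x)      # out[l, i] = sum_k Tx[i, k, l] * x[l, k]

Equivalently, as a single call: out = np.einsum('ijk,lj,lk->li', T, x, x).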
Recommended answer
Since cache usage isn't an issue for a sequence of small tensors (as it would be for general dot products of large matrices), it is easy to formulate the problem with simple loops.
Example
import numba as nb
import numpy as np
import time

@nb.njit(fastmath=True, parallel=True)
def tensor_mult(T, x):
    # res[l, i] = sum over j, k of T[i, j, k] * x[l, j] * x[l, k]
    res = np.empty((x.shape[0], T.shape[0]), dtype=T.dtype)
    for l in nb.prange(x.shape[0]):  # parallel loop over the batch
        for i in range(T.shape[0]):
            acc = 0.0  # accumulator (avoids shadowing the built-in sum)
            for j in range(T.shape[1]):
                for k in range(T.shape[2]):
                    acc += T[i, j, k] * x[l, j] * x[l, k]
            res[l, i] = acc
    return res
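A quick sanity check (an added illustration, not part of the original answer) against a direct einsum, assuming the small x and T from the question are still in scope:

# Verify the loop kernel against a direct einsum contraction
ref = np.einsum('ijk,lj,lk->li', T, x, x)
assert np.allclose(tensor_mult(T, x), ref)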
Benchmarking
x = np.random.rand(1000000, 6)
T = np.random.rand(6, 6, 6)

# The first call has a compilation overhead (about 0.6 s)
res = tensor_mult(T, x)

t1 = time.time()
for i in range(10):
    # @Divakar's solution:
    # Tx = np.tensordot(T, x, axes=((1), (1)))
    # out = np.einsum('ikl,lk->li', Tx, x)
    res = tensor_mult(T, x)
print(time.time() - t1)
Results (4C/8T)
Divakar's solution: 191 ms
Simple loops: 62.4 ms
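A plausible explanation for the gap: np.tensordot(T, x, axes=((1), (1))) materializes an intermediate of shape (6, 6, 1000000), about 36 million float64 values or roughly 288 MB, so that approach is dominated by memory traffic, while the Numba loops fuse the whole contraction and keep each row's work in registers and cache.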
That concludes this article on batched tensor multiplication with NumPy; hopefully the recommended answer proves helpful.