This article describes how to implement batch matrix multiplication using tensordot; it may be a useful reference for anyone facing the same problem.

Problem description

I'm trying to reproduce the behaviour of np.matmul's batched (parallel) matrix multiplication using only tensordot, dot, reshaping, etc.

The library I am porting this code to does not have a matmul that supports batched multiplication; it only provides dot and tensordot.

Additionally, I want to avoid iterating over the first dimension and instead do this with a set of matrix multiplications and reshaping, so that as much of it as possible runs through BLAS/the GPU, since I have a large number of small matrices to multiply in parallel.

Here is an example:

import numpy as np

angles = np.array([np.pi/4, 2*np.pi/4, 2*np.pi/4])
vectors = np.array([[1, 0], [1, -1], [-1, 0]])

s = np.sin(angles)
c = np.cos(angles)

# stack of 3 rotation matrices, shape (3, 2, 2)
rotations = np.array([[c, s], [-s, c]]).T

print(rotations)
print(vectors)

print("Correct: %s" % np.matmul(rotations, vectors.reshape(3, 2, 1)))

# I want to do this using tensordot/reshaping, i.e. just gemm BLAS operations underneath
print("Wrong: %s" % np.tensordot(rotations, vectors, axes=(1, 1)))

This outputs:

Correct: [[[  7.07106781e-01]
  [  7.07106781e-01]]

 [[  1.00000000e+00]
  [  1.00000000e+00]]

 [[ -6.12323400e-17]
  [ -1.00000000e+00]]]


Wrong: [[[  7.07106781e-01   1.11022302e-16  -7.07106781e-01]
  [ -7.07106781e-01  -1.41421356e+00   7.07106781e-01]]

 [[  6.12323400e-17  -1.00000000e+00  -6.12323400e-17]
  [ -1.00000000e+00  -1.00000000e+00   1.00000000e+00]]

 [[  6.12323400e-17  -1.00000000e+00  -6.12323400e-17]
  [ -1.00000000e+00  -1.00000000e+00   1.00000000e+00]]]

Is there a way to modify the second expression so that it gives the same result as the first, using only dot/tensordot?

I believe it is possible, and I have seen some comments online suggesting so, but never any examples.

Recommended answer

We need to keep one axis aligned between the two inputs and also keep it in the output, so tensordot/dot won't work here. More info on tensordot might explain why. However, we can use np.einsum, which in most cases (in my experience) is marginally faster than np.matmul.
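To make the alignment problem concrete, here is a small sketch (using hypothetical random data, not the arrays from the question) showing that tensordot pairs every matrix with every vector, so the batched result we want is only the diagonal over the duplicated batch axis:

import numpy as np

# hypothetical small batch, just to illustrate the axis-alignment issue
rot = np.random.rand(3, 2, 2)   # 3 matrices of shape (2, 2)
vec = np.random.rand(3, 2)      # 3 vectors of length 2

# tensordot contracts one axis but forms *every* matrix-vector pair
full = np.tensordot(rot, vec, axes=(2, 1))
print(full.shape)               # (3, 2, 3) -- the batch axis shows up twice

# the batched product we actually want is the diagonal over those two batch axes
want = np.stack([rot[i] @ vec[i] for i in range(3)])
print(np.allclose(want, full[np.arange(3), :, np.arange(3)]))   # True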

The implementation would look like this -

np.einsum('ijk,ik->ij',rotations, vectors)

Also, it seems the desired output has one trailing singleton dimension. So, append a new axis there with None/np.newaxis, like so -

np.einsum('ijk,ik->ij',rotations, vectors)[...,None]
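As a quick sanity check, here is a sketch (simply re-creating the arrays from the question) confirming that the einsum expression reproduces the np.matmul result:

import numpy as np

angles = np.array([np.pi/4, 2*np.pi/4, 2*np.pi/4])
vectors = np.array([[1, 0], [1, -1], [-1, 0]])
s, c = np.sin(angles), np.cos(angles)
rotations = np.array([[c, s], [-s, c]]).T                          # shape (3, 2, 2)

expected = np.matmul(rotations, vectors.reshape(3, 2, 1))          # shape (3, 2, 1)
result = np.einsum('ijk,ik->ij', rotations, vectors)[..., None]    # shape (3, 2, 1)
print(np.allclose(expected, result))                               # True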

That concludes this article on implementing batch matrix multiplication with tensordot; I hope the recommended answer above is helpful.