本文介绍了如何使用numpy查找两个非常大的矩阵的行之间的成对差异?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
给出两个矩阵,我想计算所有行之间的成对差异.每个矩阵有1000行和100列,因此它们相当大.我尝试使用for循环和纯广播,但是for循环似乎运行得更快.难道我做错了什么?这是代码:
Given two matrices, I want to compute the pairwise differences between all rows. Each matrix has 1000 rows and 100 columns so they are fairly large. I tried using a for loop and pure broadcasting but the for loop seem to be working faster. Am I doing something wrong? Here is the code:
from numpy import *
A = random.randn(1000,100)
B = random.randn(1000,100)
start = time.time()
for a in A:
sum((a - B)**2,1)
print time.time() - start
# pure broadcasting
start = time.time()
((A[:,newaxis,:] - B)**2).sum(-1)
print time.time() - start
广播方法大约需要1秒钟的时间,而大型矩阵的广播方法甚至需要更长的时间.知道如何仅使用numpy加快速度吗?
The broadcasting method takes about 1 second longer and it's even longer for large matrices. Any idea how to speed this up purely using numpy?
推荐答案
这是另一种执行方式:
第一个使用 np.einsum
两个词,第三个词dot-product
-
import numpy as np
np.einsum('ij,ij->i',A,A)[:,None] + np.einsum('ij,ij->i',B,B) - 2*np.dot(A,B.T)
运行时测试
方法-
def loopy_app(A,B):
m,n = A.shape[0], B.shape[0]
out = np.empty((m,n))
for i,a in enumerate(A):
out[i] = np.sum((a - B)**2,1)
return out
def broadcasting_app(A,B):
return ((A[:,np.newaxis,:] - B)**2).sum(-1)
# @Paul Panzer's soln
def outer_sum_dot_app(A,B):
return np.add.outer((A*A).sum(axis=-1), (B*B).sum(axis=-1)) - 2*np.dot(A,B.T)
# @Daniel Forsman's soln
def einsum_all_app(A,B):
return np.einsum('ijk,ijk->ij', A[:,None,:] - B[None,:,:], \
A[:,None,:] - B[None,:,:])
# Proposed in this post
def outer_einsum_dot_app(A,B):
return np.einsum('ij,ij->i',A,A)[:,None] + np.einsum('ij,ij->i',B,B) - \
2*np.dot(A,B.T)
时间-
In [51]: A = np.random.randn(1000,100)
...: B = np.random.randn(1000,100)
...:
In [52]: %timeit loopy_app(A,B)
...: %timeit broadcasting_app(A,B)
...: %timeit outer_sum_dot_app(A,B)
...: %timeit einsum_all_app(A,B)
...: %timeit outer_einsum_dot_app(A,B)
...:
10 loops, best of 3: 136 ms per loop
1 loops, best of 3: 302 ms per loop
100 loops, best of 3: 8.51 ms per loop
1 loops, best of 3: 341 ms per loop
100 loops, best of 3: 8.38 ms per loop
这篇关于如何使用numpy查找两个非常大的矩阵的行之间的成对差异?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!