Problem Description
I would like to compute an RBF or "Gaussian" kernel for a data matrix X with n rows and d columns. The resulting square kernel matrix is given by:
K[i,j] = var * exp(-gamma * ||X[i] - X[j]||^2)
var and gamma are scalars.
What is the fastest way to do this in python?
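For reference, a straightforward (unoptimized) baseline for this formula could be written with scipy.spatial.distance.cdist; this sketch (the name rbf_kernel_naive is just illustrative) is handy for checking the faster variants below against:

import numpy as np
from scipy.spatial.distance import cdist

def rbf_kernel_naive(X, gamma, var):
    # Pairwise squared Euclidean distances between all rows of X,
    # followed by the element-wise RBF formula.
    sq_dists = cdist(X, X, metric='sqeuclidean')
    return var * np.exp(-gamma * sq_dists)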
Recommended Answer
Well, you are already doing a lot of optimizations in your answer post. I would like to add a few more (mostly tweaks). I will build upon the winner from that answer post, which seems to be the numexpr based one.
First off, np.sum(X ** 2, axis = -1) could be optimized with np.einsum. Though this part isn't the biggest overhead, optimization of any sort won't hurt. So, that summation could be expressed as -
X_norm = np.einsum('ij,ij->i',X,X)
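As a quick sanity check (my own snippet, not part of the original answer), the einsum call computes exactly the row-wise squared norms that np.sum(X ** 2, axis=-1) does:

import numpy as np

X = np.random.randn(1000, 64)
# 'ij,ij->i' multiplies X element-wise with itself and sums over the columns,
# i.e. the squared L2 norm of each row.
assert np.allclose(np.einsum('ij,ij->i', X, X), np.sum(X ** 2, axis=-1))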
Tweak #2
Secondly, we could leverage the BLAS functions exposed through SciPy and, if allowed, use a single-precision dtype for a noticeable performance improvement over the double-precision one. Hence, np.dot(X, X.T) could be computed with SciPy's sgemm like so -
sgemm(alpha=1.0, a=X, b=X, trans_b=True)
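As another hedged sanity check (not from the original answer), sgemm with trans_b=True should match np.dot(X, X.T) up to single-precision rounding when X is float32:

import numpy as np
from scipy.linalg.blas import sgemm

X = np.random.randn(500, 64).astype(np.float32)
G = sgemm(alpha=1.0, a=X, b=X, trans_b=True)
# Loose tolerances because sgemm accumulates in single precision.
assert np.allclose(G, np.dot(X, X.T), rtol=1e-4, atol=1e-5)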
A few more tweaks on rearranging the negative sign with gamma let us feed more of the work into sgemm. Also, we can push gamma into the alpha term.
Thus, with these two optimizations, we would have two more variants (if I could put it that way) of the numexpr method, listed below -
import numpy as np
import numexpr as ne
from scipy.linalg.blas import sgemm

def app1(X, gamma, var):
    # Negated squared row norms, so the sign is folded into the norms.
    X_norm = -np.einsum('ij,ij->i', X, X)
    return ne.evaluate('v * exp(g * (A + B + 2 * C))', {
        'A': X_norm[:, None],
        'B': X_norm[None, :],
        'C': np.dot(X, X.T),
        'g': gamma,
        'v': var
    })

def app2(X, gamma, var):
    # gamma is pushed into the norms here and into sgemm's alpha below.
    X_norm = -gamma * np.einsum('ij,ij->i', X, X)
    return ne.evaluate('v * exp(A + B + C)', {
        'A': X_norm[:, None],
        'B': X_norm[None, :],
        'C': sgemm(alpha=2.0*gamma, a=X, b=X, trans_b=True),
        'g': gamma,
        'v': var
    })
Runtime test
The numexpr based one from your answer post -
def app0(X, gamma, var):
    X_norm = np.sum(X ** 2, axis=-1)
    return ne.evaluate('v * exp(-g * (A + B - 2 * C))', {
        'A': X_norm[:, None],
        'B': X_norm[None, :],
        'C': np.dot(X, X.T),
        'g': gamma,
        'v': var
    })
Timings and verification -
In [165]: # Setup
     ...: X = np.random.randn(10000, 512)
     ...: gamma = 0.01
     ...: var = 5.0

In [166]: %timeit app0(X, gamma, var)
     ...: %timeit app1(X, gamma, var)
     ...: %timeit app2(X, gamma, var)
1 loop, best of 3: 1.25 s per loop
1 loop, best of 3: 1.24 s per loop
1 loop, best of 3: 973 ms per loop

In [167]: np.allclose(app0(X, gamma, var), app1(X, gamma, var))
Out[167]: True

In [168]: np.allclose(app0(X, gamma, var), app2(X, gamma, var))
Out[168]: True
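One caveat worth adding (my own note, not part of the original answer): app2 goes through single-precision sgemm while app0 stays in double precision, so for larger gamma values or higher-dimensional data the agreement may need a slightly looser tolerance, e.g.:

np.allclose(app0(X, gamma, var), app2(X, gamma, var), rtol=1e-4, atol=1e-6)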