Question
I am trying to convert part of a native Python function to Cython to improve the compute time. I would like to write a Cython function just for the loop component that is taking up the time (as IPython's lprun kindly told me). However, this function takes in variably sized matrices, and I can't see how to bring that across easily to statically typed Cython.
for index1 in range(0,num_products):
    for index2 in range(0,num_products):
        cond_prob = (data[index1] * data[index2]).sum() / max(col_sums[index1], col_sums[index2])
        prox[index1][index2] = cond_prob
The issue is that num_products changes from year to year, so the size of the matrix (data) is variable.
What is the best strategy here?
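- Should I write two C functions: one that uses memalloc to create a matrix of a specific dimension, and one that does the looping over the created matrices?
- Is there some nifty Cython/NumPy wizardry that can help in this situation? Can I write a C function that takes in variably sized NumPy arrays in memory and pass in the size?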
Recommended answer
Cython code is (strategically) statically typed, but that doesn't mean that arrays must have a fixed size. In plain C, passing a multidimensional array to a function can be a little awkward, but in Cython a typed memoryview such as double[:,:] fixes only the element type and the number of dimensions; the shape is read at run time. You should be able to do something like the following:
Note that I took the function and variable names from your follow-up question.
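import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.cdivision(True)
def cooccurance_probability_cy(double[:,:] X):
    cdef int P, N, i, j, k
    P = X.shape[0]  # number of rows (products), read at run time
    N = X.shape[1]  # number of columns
    cdef double item
    cdef double [:] CS = np.sum(X, axis=1)  # row sums
    # np.float was removed in NumPy 1.24; np.float64 is the equivalent dtype
    cdef double [:,:] D = np.empty((P, P), dtype=np.float64)
    for i in range(P):
        for j in range(P):
            item = 0
            for k in range(N):  # sum over columns, so X need not be square
                item += X[i,k] * X[j,k]
            D[i,j] = item / max(CS[i], CS[j])
    return np.asarray(D)  # convert the memoryview back to an ndarray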
On the other hand, using just NumPy should also be quite fast for this problem, if you use the right functions and some broadcasting. In fact, as the calculation complexity is dominated by the matrix multiplication, I found the following to be much faster than the Cython code above (np.inner uses a highly optimized BLAS routine):
def new(X):
    CS = np.sum(X, axis=1, keepdims=True)
    D = np.inner(X,X) / np.maximum(CS, CS.T)
    return D
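As a sanity check, the double loop from the question and the vectorized NumPy version can be compared on inputs of different sizes. The following is a minimal sketch, not part of the original answer: the names prox_loop and prox_vectorized are hypothetical, and it assumes that col_sums[i] in the question is the sum of row i of data, which is what the answer's np.sum(X, axis=1) computes.

```python
import numpy as np

def prox_loop(data):
    """Double loop from the question; the size is read from the input array."""
    num_products = data.shape[0]
    # Assumption: col_sums[i] is the sum of row i of data, as in the answer.
    col_sums = data.sum(axis=1)
    prox = np.empty((num_products, num_products), dtype=np.float64)
    for index1 in range(num_products):
        for index2 in range(num_products):
            cond_prob = (data[index1] * data[index2]).sum() / max(
                col_sums[index1], col_sums[index2]
            )
            prox[index1, index2] = cond_prob
    return prox

def prox_vectorized(X):
    """Same computation using np.inner and broadcasting, as in the answer."""
    CS = np.sum(X, axis=1, keepdims=True)
    return np.inner(X, X) / np.maximum(CS, CS.T)

# Try two differently sized (and non-square) random 0/1 matrices.
rng = np.random.default_rng(0)
for shape in [(5, 8), (40, 12)]:
    X = rng.integers(0, 2, size=shape).astype(np.float64)
    X[:, 0] = 1.0  # keep every row sum positive to avoid division by zero
    assert np.allclose(prox_loop(X), prox_vectorized(X))
```

Neither function hard-codes a matrix size, so the same code handles a different num_products each year.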