Problem description
I am new to Cython and am trying to learn how to use it with NumPy to speed up my code. I have been following the tutorial in this link.
I have copied their code here:
from __future__ import division
import numpy as np
# "cimport" is used to import special compile-time information
# about the numpy module (this is stored in a file numpy.pxd which is
# currently part of the Cython distribution).
cimport numpy as np
# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
DTYPE = np.int
# "ctypedef" assigns a corresponding compile-time type to DTYPE_t. For
# every type in the numpy module there's a corresponding compile-time
# type with a _t-suffix.
ctypedef np.int_t DTYPE_t
# The builtin min and max functions work with Python objects, and are
# so very slow. So we create our own.
#  - "cdef" declares a function which has much less overhead than a normal
#    def function (but it is not Python-callable)
#  - "inline" is passed on to the C compiler which may inline the functions
#  - The C type "int" is chosen as return type and argument types
#  - Cython allows some newer Python constructs like "a if x else b", but
#    the resulting C file compiles with Python 2.3 through to Python 3.0 beta.
cdef inline int int_max(int a, int b): return a if a >= b else b
cdef inline int int_min(int a, int b): return a if a <= b else b
# "def" can type its arguments but not have a return type. The type of the
# arguments for a "def" function is checked at run-time when entering the
# function.
#
# The arrays f, g and h are typed as "np.ndarray" instances. The only effect
# this has is to a) insert checks that the function arguments really are
# NumPy arrays, and b) make some attribute access like f.shape[0] much
# more efficient. (In this example this doesn't matter though.)
cimport cython
@cython.boundscheck(False)
def naive_convolve(np.ndarray[DTYPE_t, ndim=2] f, np.ndarray[DTYPE_t, ndim=2] g):
    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    assert f.dtype == DTYPE and g.dtype == DTYPE
    # The "cdef" keyword is also used within functions to type variables. It
    # can only be used at the top indentation level (there are non-trivial
    # problems with allowing them in other places, though we'd love to see
    # good and thought out proposals for it).
    #
    # For the indices, the "int" type is used. This corresponds to a C int,
    # other C types (like "unsigned int") could have been used instead.
    # Purists could use "Py_ssize_t" which is the proper Python type for
    # array indices.
    cdef int vmax = f.shape[0]
    cdef int wmax = f.shape[1]
    cdef int smax = g.shape[0]
    cdef int tmax = g.shape[1]
    cdef int smid = smax // 2
    cdef int tmid = tmax // 2
    cdef int xmax = vmax + 2*smid
    cdef int ymax = wmax + 2*tmid
    cdef np.ndarray[DTYPE_t, ndim=2] h = np.zeros([xmax, ymax], dtype=DTYPE)
    cdef int s, t
    cdef unsigned int x, y, v, w
    # It is very important to type ALL your variables. You do not get any
    # warnings if not, only much slower code (they are implicitly typed as
    # Python objects).
    cdef int s_from, s_to, t_from, t_to
    # For the value variable, we want to use the same data type as is
    # stored in the array, so we use "DTYPE_t" as defined above.
    # NB! An important side-effect of this is that if "value" overflows its
    # datatype size, it will simply wrap around like in C, rather than raise
    # an error like in Python.
    cdef DTYPE_t value
    for x in range(xmax):
        for y in range(ymax):
            s_from = int_max(smid - x, -smid)
            s_to = int_min((xmax - x) - smid, smid + 1)
            t_from = int_max(tmid - y, -tmid)
            t_to = int_min((ymax - y) - tmid, tmid + 1)
            value = 0
            for s in range(s_from, s_to):
                for t in range(t_from, t_to):
                    v = <unsigned int>(x - smid + s)
                    w = <unsigned int>(y - tmid + t)
                    value += g[<unsigned int>(smid - s), <unsigned int>(tmid - t)] * f[v, w]
            h[x, y] = value
    return h
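For comparison, here is a minimal pure-Python sketch of the same loop, which may help when checking that a compiled version still gives the right answer after changing a type declaration. The name naive_convolve_py is hypothetical and not part of the tutorial; the sketch assumes 2-D integer arrays and uses the builtin min and max in place of int_min and int_max.

```python
import numpy as np


def naive_convolve_py(f, g):
    """Untyped reference version of the tutorial's loop (hypothetical helper)."""
    if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
        raise ValueError("Only odd dimensions on filter supported")
    vmax, wmax = f.shape
    smax, tmax = g.shape
    smid, tmid = smax // 2, tmax // 2
    # The output is padded by the filter's half-width on every side.
    xmax, ymax = vmax + 2 * smid, wmax + 2 * tmid
    h = np.zeros((xmax, ymax), dtype=f.dtype)
    for x in range(xmax):
        for y in range(ymax):
            # Clip the filter offsets so that f is only read inside its bounds.
            s_from = max(smid - x, -smid)
            s_to = min((xmax - x) - smid, smid + 1)
            t_from = max(tmid - y, -tmid)
            t_to = min((ymax - y) - tmid, tmid + 1)
            value = 0
            for s in range(s_from, s_to):
                for t in range(t_from, t_to):
                    value += g[smid - s, tmid - t] * f[x - smid + s, y - tmid + t]
            h[x, y] = value
    return h


# Convolving a single-element "impulse" with a filter should return the filter.
impulse = np.array([[1]], dtype=np.int64)
kernel = np.arange(1, 10, dtype=np.int64).reshape(3, 3)
print(naive_convolve_py(impulse, kernel))
```

Running both versions on the same small inputs and comparing with np.array_equal is one way to confirm that a typing change did not alter the result.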
There is one thing I do not understand. I know, from this link about Cython language basics, that cdef defines a C type. However, the example above also defines a compile-time type called np.int_t, for example in the line cdef DTYPE_t value, where DTYPE_t is actually np.int_t.
My question is: what is the difference between np.int and np.int_t? Is it similar to Python int versus ctypes.c_int, but more specific to NumPy? In that case, would it be the same if I simply used cdef int instead of cdef np.int_t?
I also tested what happens if I replace cdef DTYPE_t value with cdef int value. The timings show no difference between the two.
This is for the original cdef DTYPE_t value:
1 loops, best of 10: 93.9 ms per loop
This is for the modified cdef int value:
1 loops, best of 10: 93.8 ms per loop
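Equal timings do not by themselves show that the two declarations behave the same, because the tutorial's comment about wrap-around depends on how wide the C type is. The following is a small sketch using ctypes from plain Python (not Cython) to illustrate that effect; the helper name store_and_read is hypothetical, and the widths of int and long are platform-dependent, so the sketch only prints them.

```python
import ctypes

INT_BITS = 8 * ctypes.sizeof(ctypes.c_int)
LONG_BITS = 8 * ctypes.sizeof(ctypes.c_long)


def store_and_read(ctype, n):
    """Store n in a C integer of the given ctypes type and read it back.

    ctypes does no overflow checking, so a value that does not fit
    wraps around, much like a C variable of that type would.
    """
    return ctype(n).value


# One more than the largest value a signed C int can hold.
too_big_for_int = 2 ** (INT_BITS - 1)

print("C int is", INT_BITS, "bits; C long is", LONG_BITS, "bits")
print("as C int:      ", store_and_read(ctypes.c_int, too_big_for_int))
print("as C long long:", store_and_read(ctypes.c_longlong, too_big_for_int))
```

On a platform where long is wider than int, an accumulator declared as cdef int could wrap at a point where one declared with the wider type would not, even though both run at the same speed.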
Any help would be appreciated. Thanks!
Recommended answer
np.int is a Python object that refers to the integer dtype in Python code. np.int_t is a C typedef that exists only in Cython. (I believe it corresponds to C long, not int.)
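The runtime side of this distinction can be inspected from plain Python. The sketch below is an illustration rather than part of the original answer: it prints the byte size of NumPy's default integer dtype next to the sizes of C int and C long as reported by ctypes. It uses np.int_ because np.int was only an alias for the builtin int and was removed in NumPy 1.24. The compile-time typedef np.int_t is not visible from Python at all, so it does not appear here.

```python
import ctypes

import numpy as np

# Runtime object: a dtype that NumPy consults while the program runs.
runtime_dtype = np.dtype(np.int_)

sizes = {
    "np.int_ (runtime dtype)": runtime_dtype.itemsize,
    "C int": ctypes.sizeof(ctypes.c_int),
    "C long": ctypes.sizeof(ctypes.c_long),
}

for name, nbytes in sizes.items():
    print(f"{name}: {nbytes} bytes")

# The builtin int maps to the same default integer dtype.
print(np.dtype(int) == runtime_dtype)
```

The printed sizes depend on the platform and NumPy version. Where the dtype is wider than C int, declaring the accumulator as cdef int would give it a narrower type than the array elements it sums.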