问题描述
我有以下代码用于插值3D体数据.
I have following code used to interpolate 3D volume data.
Y, X, Z = np.shape(volume)
xs = np.arange(0, X)
ys = np.arange(0, Y)
zs = np.arange(0, Z)
points = list(zip(np.ravel(result[:, :, :, 1]), np.ravel(result[:, :, :, 0]), np.ravel(result[:, :, :, 2])))
interp = interpolate.RegularGridInterpolator((ys, xs, zs), volume,
bounds_error=False, fill_value=0, method='linear')
new_volume = interp(points)
new_volume = np.reshape(new_volume, (Y, X, Z))
此代码在512x512x110体积(约2900万个点)上执行大约需要37秒,这导致每个体素超过1微秒(对于我来说这是无法接受的时间-它使用4核).调用 new_volume = interp(points)
大约需要80%的时间,而列表创建几乎要花费全部剩余时间.
This code takes about 37 seconds to execute on 512x512x110 volume (about 29 millions of points), which results in more than one microsecond per voxel (which is unacceptable amount of time for me - what is more it uses 4 cores). Call new_volume=interp(points)
takes about 80% of the prodecure time and the list creation almost whole remaining time.
是否有任何简单(或更复杂)的方法来加快计算速度?还是有什么好的Python库可以提供更快的插值?在每次调用此程序时,我的音量和积分都会改变.
Is there any simple (or even more complex) way to make this computation faster? Or is there any good Python library, which provides faster interpolation? My volume and points change in every call to this prodecure.
推荐答案
以下是您的 cython
解决方案的稍作修改的版本:
Here is slightly modified version of your cython
solution:
import numpy as np
cimport numpy as np
from libc.math cimport floor
from cython cimport boundscheck, wraparound, nonecheck, cdivision
DTYPE = np.float
ctypedef np.float_t DTYPE_t
@boundscheck(False)
@wraparound(False)
@nonecheck(False)
def interp3D(DTYPE_t[:,:,::1] v, DTYPE_t[:,:,::1] xs, DTYPE_t[:,:,::1] ys, DTYPE_t[:,:,::1] zs):
cdef int X, Y, Z
X,Y,Z = v.shape[0], v.shape[1], v.shape[2]
cdef np.ndarray[DTYPE_t, ndim=3] interpolated = np.zeros((X, Y, Z), dtype=DTYPE)
_interp3D(&v[0,0,0], &xs[0,0,0], &ys[0,0,0], &zs[0,0,0], &interpolated[0,0,0], X, Y, Z)
return interpolated
@cdivision(True)
cdef inline void _interp3D(DTYPE_t *v, DTYPE_t *x_points, DTYPE_t *y_points, DTYPE_t *z_points,
DTYPE_t *result, int X, int Y, int Z):
cdef:
int i, x0, x1, y0, y1, z0, z1, dim
DTYPE_t x, y, z, xd, yd, zd, c00, c01, c10, c11, c0, c1, c
dim = X*Y*Z
for i in range(dim):
x = x_points[i]
y = y_points[i]
z = z_points[i]
x0 = <int>floor(x)
x1 = x0 + 1
y0 = <int>floor(y)
y1 = y0 + 1
z0 = <int>floor(z)
z1 = z0 + 1
xd = (x-x0)/(x1-x0)
yd = (y-y0)/(y1-y0)
zd = (z-z0)/(z1-z0)
if x0 >= 0 and y0 >= 0 and z0 >= 0:
c00 = v[Y*Z*x0+Z*y0+z0]*(1-xd) + v[Y*Z*x1+Z*y0+z0]*xd
c01 = v[Y*Z*x0+Z*y0+z1]*(1-xd) + v[Y*Z*x1+Z*y0+z1]*xd
c10 = v[Y*Z*x0+Z*y1+z0]*(1-xd) + v[Y*Z*x1+Z*y1+z0]*xd
c11 = v[Y*Z*x0+Z*y1+z1]*(1-xd) + v[Y*Z*x1+Z*y1+z1]*xd
c0 = c00*(1-yd) + c10*yd
c1 = c01*(1-yd) + c11*yd
c = c0*(1-zd) + c1*zd
else:
c = 0
result[i] = c
结果仍然与您相同.使用 60x60x60
的随机网格数据,我可以获得以下时序:
The results are still identical to yours. With a random grid data of 60x60x60
I obtain the following timings:
SciPy's solution: 982ms
Your cython solution: 24.7ms
Above modified cython solution: 8.17ms
因此,其速度比 cython
解决方案快近4倍.请注意
So its nearly 4 times faster than your cython
solution. Note that
- Cython默认对数组执行边界检查,如果要使用该功能,请打开边界检查,然后删除
@boundscheck(False)
. - 在上述解决方案中,数组必须是C连续的
- 如果需要上述代码的并行变体,请在
for循环
中替换range
而不是prange
.
- Cython performs bounds checking by default on arrays and if you want that feature then turn on boundschecking remove
@boundscheck(False)
. - In the above solution the arrays are required to be C-contiguous
- If you want a parallel variant of the above code, replace
range
instead ofprange
in yourfor loop
.
希望这会有所帮助.
这篇关于Python 3D插值加速的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!