How do I compile a C function into a numpy ufunc and load it dynamically?

Question

I have some Python code that automatically generates a C function. This function takes some doubles as input and returns a double, calling various functions from the C standard library along the way.

One of the things I would like to do with it is compile it into a numpy ufunc and load it into the running Python process. I just want the function to run element-wise on its input numpy arrays at reasonable speed, the way numpy's minimum does, for example.

I was surprised that I couldn't find clear instructions or examples for how to do this. Numpy has clear instructions on writing extensions, but it's not clear how I could load these into the current Python process. With ctypes I can compile my function and load it, no problem, but it's not clear how to make it a ufunc rather than a normal Python function. Cython can also do this, and if I use pyximport it will even build the shared library for me, which is ideal because then I can distribute it without worrying about how to build the C code on another system. But again, it's not clear how to make a ufunc rather than a normal function.

TL;DR: how can I take a simple C function, compile it into a ufunc, and load it dynamically? The more foolproof and the less boilerplate, the better.

Recommended answer

One idea could be to use numba for creating the ufuncs and cffi for compiling the C code.

For example, suppose we want to double the value of every element in a numpy array, and we have the following C function as a string:

double f(double a){
    return 2.0*a;
}

A possible solution is the following prototype:

import numba as nb
import cffi

def create_ufunc(code):
    # 1. step: compile the C-code and load the resulting extension
    ffibuilder = cffi.FFI()
    ffibuilder.cdef("double f(double a);", override=True)
    built_module=ffibuilder.verify(source=code)
    fun = built_module.f

    # 2. step: create an ufunc out of the compiled C-function
    @nb.vectorize([nb.float64(nb.float64)])
    def f(x):
      return fun(x)
    return f

Now:

import numpy as np
a=np.arange(6).astype(np.float64)
my_f1=create_ufunc("double f(double a){return 2.0*a;}")
my_f1(a)
# array([  0.,   2.,   4.,   6.,   8.,  10.])

Or, if we want to multiply by 10.0:

my_f2=create_ufunc("double f(double a){return 10.0*a;}")
my_f2(a)
# array([  0.,  10.,  20.,  30.,  40.,  50.])
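
The two source strings above differ only in the constant, so, in the spirit of the question's auto-generated C code, the string itself can be produced from Python. A minimal sketch, assuming only finite factors are needed; `make_scale_source` is a hypothetical helper introduced here for illustration and is not part of cffi, numba, or the original answer:

```python
import math

def make_scale_source(factor):
    """Return C source for `double f(double a)` that multiplies by `factor`.

    Hypothetical helper for illustration only.
    """
    value = float(factor)
    if not math.isfinite(value):
        # repr() would give 'inf' or 'nan', which are not C literals
        raise ValueError("factor must be finite")
    # repr() of a finite Python float (e.g. '2.0', '1e-05') is also a
    # valid C double literal
    return "double f(double a){return %s*a;}" % repr(value)

print(make_scale_source(2))
print(make_scale_source(10.0))
```

The returned string can then be passed to create_ufunc in place of the hand-written source.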

Obviously, while it shows what is possible, this prototype needs some polishing. For example, although compact, verify is deprecated, and calling create_ufunc twice with the same code leads to a warning.

Another issue: the version above does not compile in nopython mode, even though numba supports cffi functions. It is not clear what goes wrong here. See further below for a workaround: a more complicated version that builds in nopython mode.

However, this is probably still a good starting point.

It seems to be possible to compile in nopython mode with numba if we use the out-of-line (compile) API mode instead of the in-line (verify) one:

import numba as nb
import cffi
import zlib
import importlib
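try:
    # newer numba releases keep the cffi helpers here
    from numba.core.typing import cffi_utils as nbcffi
except ImportError:
    # older numba releases
    import numba.cffi_support as nbcffi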

def create_ufunc(code):
    # 1. step: compile the C-code and load the resulting extension
    # create a different so/dll for different codes
    # and load it
    module_name="myufunc"+str(zlib.adler32(code.encode('ascii')))
    ffibuilder = cffi.FFI()
    ffibuilder.cdef("double f(double a);", override=True)
    ffibuilder.set_source(module_name=module_name,source=code)
    ffibuilder.compile(verbose=True)
    loaded = importlib.import_module(module_name)


    # 2. step: create an ufunc out of the compiled C-function
    # out-of-line modules must be registered in numba:
    nbcffi.register_module(loaded)
    fun = loaded.lib.f

    @nb.vectorize([nb.float64(nb.float64)], nopython=True)
    def f(x):
      return fun(x)
    return f

Important details:

There is a new extension (so/pyd file) for every code. We distinguish between them via the hash value of the passed code.
Over time there will be quite a few myufuncXXXX.so files around; one could think about implementing an infrastructure similar to the one used by cffi.verify.
ffibuilder.compile(verbose=True) is just for debugging purposes; verbose=False probably makes more sense in a release.
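
One way to limit rebuilds is to derive the module name from the source and skip compilation when a module with that name can already be imported. The following is only a sketch of that idea, not the infrastructure cffi.verify uses. `module_name_for` and `is_already_built` are hypothetical helpers, and hashing with hashlib.sha1 instead of zlib.adler32 is a substitution made here to reduce the chance of name collisions.

```python
import hashlib
import importlib.util

def module_name_for(code, prefix="myufunc_"):
    """Derive a stable, importable module name from the C source text.

    Hypothetical helper; the prefix is an arbitrary choice.
    """
    digest = hashlib.sha1(code.encode("utf-8")).hexdigest()
    return prefix + digest

def is_already_built(module_name):
    """Return True if a top-level module with this name is on the import path."""
    return importlib.util.find_spec(module_name) is not None

name = module_name_for("double f(double a){return 2.0*a;}")
print(name, is_already_built(name))
```

In create_ufunc, the ffibuilder.compile call could then be guarded by `if not is_already_built(module_name):`. If the extension file was created while the process was already running, calling importlib.invalidate_caches() before importlib.import_module may be necessary for the import system to see it.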
