Problem description
I am new to Cython and have very little experience with C, so bear with me.
I want to store a fixed-size sequence of immutable byte objects. The object would look like:
obj = (b'abc', b'1234', b'^&$#%')
The elements in the tuple are immutable, but their length is arbitrary.
What I tried was something along the lines of:
cdef char[3] *obj
cdef char* a, b, c
a = b'abc'
b = b'1234'
c = b'^&$#%'
obj = (a, b, c)
But I get:
Storing unsafe C derivative of temporary Python reference
Can someone please point me in the right direction?
Bonus question: how do I type an arbitrarily long sequence of those 3-tuples?
Thanks!
Recommended answer
You are definitely close! There appear to be two issues.
First, we need to change the declaration of obj so that it declares an array of char* pointers with a fixed size of 3. To do this, put the type first, then the variable name, and only then the size of the array. This gives you the desired array of char* on the stack.
Second, when you declare char* a, b, c, only a is a char*, while b and c are just char! Cython makes this clear during compilation, where it emits the following warning for me:
Non-trivial type declarators in shared declaration (e.g. mix of pointers and values). Each pointer declaration should be on its own line.
So you should do this instead:
cdef char* obj[3]
cdef char* a
cdef char* b
cdef char* c
a = b'abc'
b = b'1234'
c = b'^&$#%'
obj = [a, b, c]
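The declarator rule behind that warning is inherited from C, where the * attaches to each variable name rather than to the type. Here is a minimal C sketch of the same pitfall and the same fix, assuming a C11 compiler (for static_assert); fill_obj is a hypothetical helper used only for illustration and is not part of the Cython code:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* The '*' binds to the declarator, not to the type name: */
char *a, b, c;        /* a is a char*, but b and c are plain char */

static_assert(sizeof b == 1, "b is a single char, not a pointer");
static_assert(sizeof a == sizeof(char *), "only a is a pointer");

/* One pointer per declaration avoids the trap. */
const char *x = "abc";
const char *y = "1234";
const char *z = "^&$#%";

/* Fixed-size array of three char pointers: type, then name, then size. */
const char *obj[3];

/* Hypothetical helper: stores the three pointers and returns the
   number of slots in obj. Only the pointers are copied, not the bytes. */
size_t fill_obj(void)
{
    obj[0] = x;
    obj[1] = y;
    obj[2] = z;
    return sizeof obj / sizeof obj[0];
}
```

The array holds pointers of equal size, so the strings they point to can have any length, which matches the arbitrary-length requirement from the question.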
As a side note, you can avoid repeating cdef by grouping the declarations in a cdef block:
cdef:
    char* obj[3]
    char* a
    char* b
    char* c
a = b'abc'
b = b'1234'
c = b'^&$#%'
obj = [a, b, c]
Bonus:
Based on your level of experience with C and pointers in general, I think I will just show the more newbie-friendly approach using C++ data structures. The C++ standard library has simple containers like vector, which is roughly the equivalent of a Python list. The C alternative would be to keep a pointer to a struct, which serves as an "array" of triplets. You would then be personally in charge of managing that memory using functions like malloc, free, realloc, etc.
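For comparison, the C alternative described above might look roughly like the sketch below. This is only an illustration under my own assumptions: the names triplet_list, triplet_list_push and triplet_list_free are hypothetical, the growth policy (doubling, starting at 4) is arbitrary, and the list stores pointers to the strings rather than copies of them.

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors the three-element tuple from the question. */
typedef struct {
    const char *data[3];
} triplet;

/* Hypothetical growable "array" of triplets. */
typedef struct {
    triplet *items;   /* heap buffer, NULL while empty */
    size_t   len;     /* number of triplets stored */
    size_t   cap;     /* number of triplets the buffer can hold */
} triplet_list;

/* Appends one triplet; returns 0 on success, -1 if allocation fails. */
int triplet_list_push(triplet_list *list, triplet t)
{
    assert(list != NULL);
    if (list->len == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 4;
        /* realloc(NULL, n) behaves like malloc(n) on the first call. */
        triplet *grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;            /* the old buffer is still valid */
        list->items = grown;
        list->cap = new_cap;
    }
    list->items[list->len++] = t; /* copies the three pointers */
    return 0;
}

/* Releases the buffer and resets the list to the empty state. */
void triplet_list_free(triplet_list *list)
{
    assert(list != NULL);
    free(list->items);
    list->items = NULL;
    list->len = 0;
    list->cap = 0;
}
```

A caller would start from a zero-initialized triplet_list, push triplets as needed, and call triplet_list_free once when done. This is the bookkeeping that vector handles for you in the C++ version.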
Here is something to get you started; I strongly suggest you follow some online C or C++ tutorials on your own and adapt them to Cython, which should be fairly trivial after some practice. I am showing both a test.pyx file and the setup.py file that shows how to compile it with C++ support.
test.pyx
from libcpp.vector cimport vector

"""
While it can be discouraged to mix raw char* C "strings" with C++ data types,
the code here is pretty simple.
Fixed arrays cannot be used directly for vector type, so we use a struct.
Ideally, you would use an std::array of std::string, but std::array does not
exist in cython's libcpp. It should be easy to add support for this with an
extern statement though (out of the scope of this mini-tutorial).
"""

ctypedef struct triplet:
    char* data[3]

cdef:
    vector[triplet] obj
    triplet abc
    triplet xyz

# bytes literals, so the coercion to char* does not depend on language_level
abc.data = [b"abc", b"1234", b"^&$#%"]
xyz.data = [b"xyz", b"5678", b"%#$&^"]
obj.push_back(abc)  # pretty much like python's list.append
obj.push_back(xyz)

"""
Loops through the vector.
Cython can automagically print structs so long as their members can be
converted trivially to python types.
"""
for o in obj:
    print(o)
setup.py
# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup, Extension
from Cython.Build import cythonize


def create_extension(ext_name):
    global language, libs, args, link_args
    path_parts = ext_name.split(".")
    path = "./{0}.pyx".format("/".join(path_parts))
    ext = Extension(ext_name, sources=[path], libraries=libs, language=language,
                    extra_compile_args=args, extra_link_args=link_args)
    return ext


if __name__ == "__main__":
    libs = []  # no external c libraries in this case
    language = "c++"  # chooses c++ rather than c since STL is used
    args = ["-w", "-O3", "-ffast-math", "-march=native", "-fopenmp"]  # assumes gcc is the compiler
    link_args = ["-fopenmp"]  # matches -fopenmp above; only needed for parallel code
    annotate = True  # autogenerates .html files per .pyx
    directives = {  # saves typing @cython decorators and applies them globally
        "boundscheck": False,
        "wraparound": False,
        "initializedcheck": False,
        "cdivision": True,
        "nonecheck": False,
    }
    ext_names = [
        "test",
    ]
    extensions = [create_extension(ext_name) for ext_name in ext_names]
    setup(ext_modules=cythonize(
        extensions,
        annotate=annotate,
        compiler_directives=directives,
    ))