Problem description
I need to organize a data file with chunks of named data. The data are NumPy arrays. But I don't want to use the numpy.save or numpy.savez functions, because in some cases the data has to be sent to a server over a pipe or another interface. So I want to dump the NumPy array into memory, compress it, and then send it to the server.
I've tried simple pickling, like this:
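try:
    import cPickle as pkl  # Python 2: C implementation of pickle
except ImportError:
    import pickle as pkl   # Python 3
import zlib
import numpy as np

def send_to_db(data, compress=5):
    # send() is the transport function (pipe, socket, ...), defined elsewhere
    send(zlib.compress(pkl.dumps(data), compress))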
... but this is an extremely slow process.
Even with compression level 0 (no compression), the process is very slow, so the time is spent in pickling alone.
Is there any way to dump a NumPy array into a string without pickle? I know that NumPy lets you get a buffer with numpy.getbuffer, but it isn't obvious to me how to use this dumped buffer to get the array back.
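For reference, the raw-buffer round trip the question alludes to might look roughly like the sketch below. It uses ndarray.tobytes and numpy.frombuffer rather than numpy.getbuffer, and it assumes the dtype and shape are sent alongside the payload, since raw bytes carry no metadata. The variable names are illustrative only.

```python
import zlib
import numpy as np

arr = np.arange(12, dtype=np.float64).reshape(3, 4)

# Sender side: the raw bytes hold only the data, so dtype and shape
# must travel separately (here as a small tuple).
payload = zlib.compress(arr.tobytes())
meta = (arr.dtype.str, arr.shape)

# Receiver side: rebuild the array from the bytes plus the metadata.
dtype_str, shape = meta
restored = np.frombuffer(zlib.decompress(payload), dtype=dtype_str).reshape(shape)

# frombuffer on a bytes object yields a read-only view; copy if you need to modify it.
editable = restored.copy()
```

Having to ship the metadata by hand is the main drawback of this route.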
Recommended answer
You should definitely use numpy.save; you can still do it in memory:
>>> import io
>>> import numpy as np
>>> import zlib
>>> f = io.BytesIO()
>>> arr = np.random.rand(100, 100)
>>> np.save(f, arr)
>>> compressed = zlib.compress(f.getvalue())
To decompress, reverse the process:
>>> np.load(io.BytesIO(zlib.decompress(compressed)))
array([[ 0.80881898, 0.50553303, 0.03859795, ..., 0.05850996,
0.9174782 , 0.48671767],
[ 0.79715979, 0.81465744, 0.93529834, ..., 0.53577085,
0.59098735, 0.22716425],
[ 0.49570713, 0.09599001, 0.74023709, ..., 0.85172897,
0.05066641, 0.10364143],
...,
[ 0.89720137, 0.60616688, 0.62966729, ..., 0.6206728 ,
0.96160519, 0.69746633],
[ 0.59276237, 0.71586014, 0.35959289, ..., 0.46977027,
0.46586237, 0.10949621],
[ 0.8075795 , 0.70107856, 0.81389246, ..., 0.92068768,
0.38013495, 0.21489793]])
>>>
As you can see, this matches what we saved earlier:
>>> arr
array([[ 0.80881898, 0.50553303, 0.03859795, ..., 0.05850996,
0.9174782 , 0.48671767],
[ 0.79715979, 0.81465744, 0.93529834, ..., 0.53577085,
0.59098735, 0.22716425],
[ 0.49570713, 0.09599001, 0.74023709, ..., 0.85172897,
0.05066641, 0.10364143],
...,
[ 0.89720137, 0.60616688, 0.62966729, ..., 0.6206728 ,
0.96160519, 0.69746633],
[ 0.59276237, 0.71586014, 0.35959289, ..., 0.46977027,
0.46586237, 0.10949621],
[ 0.8075795 , 0.70107856, 0.81389246, ..., 0.92068768,
0.38013495, 0.21489793]])
>>>
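Since the question asks for chunks of named data, the same in-memory approach can be extended with numpy.savez, which stores several arrays under names in one archive. The following is a minimal sketch under that assumption; pack_arrays and unpack_arrays are hypothetical helper names, not part of NumPy, and the compression level of 5 simply mirrors the question.

```python
import io
import zlib
import numpy as np

def pack_arrays(level=5, **arrays):
    """Serialize named arrays into one compressed bytes blob (hypothetical helper)."""
    buf = io.BytesIO()
    np.savez(buf, **arrays)          # writes an in-memory .npz archive
    return zlib.compress(buf.getvalue(), level)

def unpack_arrays(blob):
    """Inverse of pack_arrays: return a dict mapping name -> array."""
    with np.load(io.BytesIO(zlib.decompress(blob))) as npz:
        return {name: npz[name] for name in npz.files}

weights = np.random.rand(100, 100)
labels = np.arange(10)

blob = pack_arrays(weights=weights, labels=labels)   # bytes, ready to send
chunks = unpack_arrays(blob)
```

The resulting blob is an ordinary bytes object, so it can be written to a pipe or socket like any other payload.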