Problem description
I'm trying to create a random matrix and save it to a binary file using numpy.save.
Then I try to map this file using numpy.memmap, but it seems to map it incorrectly.
How can I fix this?
It seems to read the .npy header as data, so I need to skip some bytes from the beginning.
import numpy as np

rows = 6
cols = 4

def create_matrix(rows, cols):
    data = (np.random.rand(rows, cols) * 100).astype('uint8')  # type for image [0 255] int8?
    return data

def save_matrix(filename, data):
    np.save(filename, data)

def load_matrix(filename):
    data = np.load(filename)
    return data

def test_mult_ram():
    A = create_matrix(rows, cols)
    A[1][2] = 42
    save_matrix("A.npy", A)
    A = load_matrix("A.npy")
    print(A)
    B = create_matrix(cols, rows)
    save_matrix("B.npy", B)
    B = load_matrix("B.npy")
    print(B)
    # These map the file from byte 0, so the .npy header is read as array data
    fA = np.memmap('A.npy', dtype='uint8', mode='r', shape=(rows, cols))
    fB = np.memmap('B.npy', dtype='uint8', mode='r', shape=(cols, rows))
    print(fA)
    print(fB)
Update:
I just found that the np.lib.format.open_memmap function already exists.
Usage: a = np.lib.format.open_memmap('A.npy', dtype='uint8', mode='r+')
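As a minimal sketch of that approach, the snippet below saves a small array and maps it back with np.lib.format.open_memmap, which reads the header itself so no manual offset is needed. It also shows np.load with mmap_mode='r', which gives read-only access in the same way. The temporary directory and file name are placeholders chosen for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch location; any writable path works.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "A.npy")

A = (np.random.rand(6, 4) * 100).astype("uint8")
np.save(path, A)

# open_memmap parses the .npy header, so dtype and shape come from the file.
fA = np.lib.format.open_memmap(path, mode="r")

# np.load can also return a memory map when mmap_mode is given.
gA = np.load(path, mmap_mode="r")

print(fA.shape, fA.dtype)
print(np.array_equal(A, fA), np.array_equal(A, gA))
```

When an existing file is opened this way, the shape and dtype stored in the header are used, so they do not need to be repeated by the caller.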
Recommended answer
The npy format has a header that must be skipped when using np.memmap. It starts with a 6-byte magic string, '\x93NUMPY', followed by a 2-byte version number, then a 2-byte header length (in format version 1.0; versions 2.0 and later use a 4-byte length), and then the header data.
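To make that layout concrete, here is a small sketch that reads the fields one by one with struct and reports where the array data begins. The helper name describe_npy_header and the temporary file are assumptions made for this example, and the latin1 decoding is only meant for format versions 1.0 and 2.0.

```python
import os
import struct
import tempfile

import numpy as np


def describe_npy_header(path):
    """Return (magic, version, header_len, header_text, data_offset)."""
    with open(path, "rb") as f:
        magic = f.read(6)  # expected to be b'\x93NUMPY'
        major, minor = struct.unpack("<BB", f.read(2))
        if major == 1:
            # version 1.0: 2-byte little-endian unsigned length
            (header_len,) = struct.unpack("<H", f.read(2))
        else:
            # version 2.0 and later: 4-byte little-endian unsigned length
            (header_len,) = struct.unpack("<I", f.read(4))
        header_text = f.read(header_len).decode("latin1")
        data_offset = f.tell()  # position of the first byte of array data
    return magic, (major, minor), header_len, header_text, data_offset


# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "A.npy")
arr = np.zeros((6, 4), dtype="uint8")
np.save(path, arr)

magic, version, header_len, header_text, data_offset = describe_npy_header(path)
print(magic, version, header_len, data_offset)
print(header_text.strip())
```

The header text is a Python dict literal describing the dtype, memory order, and shape, and everything after data_offset is the raw array data.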
So if you open the file and read the header length, you can compute the offset to pass to np.memmap:
def load_npy_to_memmap(filename, dtype, shape):
    # npy format is documented here
    # https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt
    with open(filename, 'rb') as f:
        # skip magic string \x93NUMPY + 2 bytes major/minor version number
        # + 2 bytes little-endian unsigned short int (format version 1.0)
        junk, header_len = struct.unpack('<8sH', f.read(10))
    data = np.memmap(filename, dtype=dtype, shape=shape, offset=6+2+2+header_len)
    return data
import struct
import numpy as np

np.random.seed(1)
rows = 6
cols = 4

def create_matrix(rows, cols):
    data = (np.random.rand(
        rows, cols) * 100).astype('uint8')  # type for image [0 255] int8?
    return data

def save_matrix(filename, data):
    np.save(filename, data)

def load_matrix(filename):
    data = np.load(filename)
    return data

def load_npy_to_memmap(filename, dtype, shape):
    # npy format is documented here
    # https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt
    with open(filename, 'rb') as f:
        # skip magic string \x93NUMPY + 2 bytes major/minor version number
        # + 2 bytes little-endian unsigned short int (format version 1.0)
        junk, header_len = struct.unpack('<8sH', f.read(10))
    data = np.memmap(filename, dtype=dtype, shape=shape, offset=6+2+2+header_len)
    return data

def test_mult_ram():
    A = create_matrix(rows, cols)
    A[1][2] = 42
    save_matrix("A.npy", A)
    A = load_matrix("A.npy")
    print(A)
    B = create_matrix(cols, rows)
    save_matrix("B.npy", B)
    B = load_matrix("B.npy")
    print(B)
    fA = load_npy_to_memmap('A.npy', dtype='uint8', shape=(rows, cols))
    fB = load_npy_to_memmap('B.npy', dtype='uint8', shape=(cols, rows))
    print(fA)
    print(fB)
    np.testing.assert_equal(A, fA)
    np.testing.assert_equal(B, fB)

test_mult_ram()
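The loader above needs the caller to supply dtype and shape and assumes a 2-byte header length. As a possible variation, the sketch below lets NumPy's own header readers in np.lib.format parse the file and then uses the file position as the offset. The function name npy_to_memmap is hypothetical, and the sketch only handles format versions 1.0 and 2.0 and dtypes that np.memmap can map (so not object arrays).

```python
import os
import tempfile

import numpy as np


def npy_to_memmap(filename, mode="r"):
    """Map the data section of a version 1.0 or 2.0 .npy file."""
    with open(filename, "rb") as f:
        version = np.lib.format.read_magic(f)
        if version == (1, 0):
            shape, fortran_order, dtype = np.lib.format.read_array_header_1_0(f)
        elif version == (2, 0):
            shape, fortran_order, dtype = np.lib.format.read_array_header_2_0(f)
        else:
            raise ValueError("unsupported .npy version: %r" % (version,))
        offset = f.tell()  # the header has been consumed; data starts here
    order = "F" if fortran_order else "C"
    return np.memmap(filename, dtype=dtype, mode=mode, shape=shape,
                     order=order, offset=offset)


# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "A.npy")
A = (np.random.rand(6, 4) * 100).astype("uint8")
np.save(path, A)

fA = npy_to_memmap(path)
print(fA.shape, fA.dtype, np.array_equal(A, fA))
```

Reading the header through np.lib.format avoids hard-coding the 10-byte prefix, at the cost of depending on those helper functions being available in the installed NumPy.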