Loading every nth element with numpy.fromfile
Problem description
I want to create a numpy array from a binary file using np.fromfile. The file contains a 3D array, and I'm only concerned with a certain cell in each frame.
x = np.fromfile(file, dtype='int32', count=width*height*frames)
vals = x[5::width*height]
The code above would work in theory, but my file is very large, and reading it all into x causes memory errors. Is there a way to use fromfile to get only vals in the first place?
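To make the indexing in the question concrete, here is a minimal sketch of what x[5::width*height] selects. The dimensions are hypothetical, and it assumes the file stores the frames one after another in row-major (C) order, which the question does not state explicitly.

```python
import numpy as np

# Hypothetical small dimensions; the real file is far larger.
frames, height, width = 4, 3, 2
data = np.arange(frames * height * width, dtype=np.int32).reshape(frames, height, width)

# Flat view, laid out the way np.fromfile would return it (assuming C order).
x = data.ravel()

cell = 5  # flat index of the wanted cell inside a single frame
vals = x[cell::width * height]

# The same cell addressed with 3D indices.
row, col = divmod(cell, width)
same = data[:, row, col]

print(vals)
print(np.array_equal(vals, same))
```

Each frame occupies width*height consecutive elements, so stepping by that amount lands on the same cell in every frame.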
Recommended answer
This may be horribly inefficient, but it works:
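import numpy as np

def read_in_chunks(fn, offset, step, steps_per_chunk, dtype=np.int32):
    """Return elements offset, offset+step, ... of the binary file fn."""
    out = []
    with open(fn, 'rb') as fd:
        while True:
            # Read a whole number of steps at a time so that the slice
            # stays aligned from one chunk to the next.
            chunk = np.fromfile(fd, dtype=dtype,
                                count=steps_per_chunk*step)[offset::step]
            if chunk.size == 0:
                break
            out.append(chunk)
    # np.concatenate raises on an empty list, so handle an empty result.
    return np.concatenate(out) if out else np.empty(0, dtype=dtype)

# Demo: write 100000 integers, then read back every 100th one from index 2.
x = np.arange(100000)
x.tofile('test.bin')
b = read_in_chunks('test.bin', 2, 100, 6, int)
print(b)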
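The chunked reader depends on every chunk holding a whole multiple of step elements, so that slicing each chunk with [offset::step] picks the same elements as slicing the full array once. The following sketch checks that property on an in-memory array instead of a file; the sizes are arbitrary illustration values.

```python
import numpy as np

x = np.arange(1000)
offset, step, steps_per_chunk = 2, 100, 6
chunk_len = steps_per_chunk * step  # always a whole multiple of step

# Slice each chunk separately, the way a chunked reader would.
pieces = [x[start:start + chunk_len][offset::step]
          for start in range(0, x.size, chunk_len)]
chunked = np.concatenate(pieces)

# Slice the whole array in one go for comparison.
direct = x[offset::step]

print(np.array_equal(chunked, direct))
```

If the chunk length were not a multiple of step, the wanted elements would fall at a different position in each chunk and the two results would no longer match.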
Update:
Here's one that uses seek to skip over the unwanted data. It works for me, but is totally undertested.
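def skip_load(fn, offset, step, dtype=np.float64, n=10**100):
    # np.float was removed in NumPy 1.24; np.float64 is the same type.
    elsize = np.dtype(dtype).itemsize
    # Convert element counts to byte counts.
    step *= elsize
    offset *= elsize
    # Accept either a file name or an already open binary file object.
    fd = open(fn, 'rb') if isinstance(fn, str) else fn
    out = []
    pos = fd.tell()
    # First position at or after pos that equals offset modulo step.
    target = ((pos - offset - 1) // step + 1) * step + offset
    fd.seek(target)
    while n > 0:
        if (fd.tell() != target):
            return np.frombuffer(b"".join(out), dtype=dtype)
        out.append(fd.read(elsize))
        n -= 1
        # A short read means the end of the file; drop the partial element.
        if len(out[-1]) < elsize:
            return np.frombuffer(b"".join(out[:-1]), dtype=dtype)
        target += step
        fd.seek(target)
    return np.frombuffer(b"".join(out), dtype=dtype)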
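The starting-position formula in skip_load rounds the current file position up to the nearest byte position that is congruent to offset modulo step. Below is a small sketch of that arithmetic in isolation. The helper name first_target and the sample values (8-byte elements, offset 2, step 100) are assumptions made only for illustration.

```python
def first_target(pos, offset, step):
    """Smallest byte position >= pos equal to offset modulo step (hypothetical helper)."""
    return ((pos - offset - 1) // step + 1) * step + offset

elsize = 8             # assumed element size in bytes, e.g. float64
offset = 2 * elsize    # byte offset of the first wanted element
step = 100 * elsize    # byte distance between wanted elements

for pos in (0, 16, 17, 816, 817):
    print(pos, first_target(pos, offset, step))
```

A position that already sits on a wanted element is returned unchanged, which is why passing an open file object that has been partly read still resumes at the correct element.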