This article covers how to get cv2.imread to read an image from a file object or memory-stream-like data (here, an unextracted tar).

Problem description

I have a .tar file containing several hundred pictures (.png). I need to process them with opencv.

I am wondering whether, for efficiency reasons, it is possible to process them without going through the disk. In other words, I want to read the pictures from the memory stream associated with the tar file.

For example, consider

 import tarfile
 import cv2

 tar0 = tarfile.open('mytar.tar')
 im = cv2.imread( tar0.extractfile('fname.png').read() )

The last line doesn't work because imread expects a file name rather than a stream.

Note that reading directly from the tar stream in this way is possible for text, for example (see e.g. this SO question).

Any suggestions for opening the stream with the correct png encoding?

Untarring to a ramdisk is of course an option, although I was looking for something more cacheable.

Recommended answer

Thanks to the suggestion from @abarry and this SO answer, I managed to find the answer.

Consider the following

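import tarfile

import cv2
import numpy as np

def get_np_array_from_tar_object(tar_extractfl):
    '''Convert the file object returned by extractfile() into a 1-D uint8 array.'''
    return np.asarray(bytearray(tar_extractfl.read()), dtype=np.uint8)

tar0 = tarfile.open('mytar.tar')

# imdecode parses the encoded PNG bytes held in memory.
# Flag 0 (cv2.IMREAD_GRAYSCALE) decodes to a single-channel image.
im0 = cv2.imdecode(
    get_np_array_from_tar_object(tar0.extractfile('fname.png')),
    0)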
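
Since the question mentions several hundred pictures, the single-file approach above could be extended to walk the whole archive. The following is only a minimal sketch using the standard library: `iter_png_bytes` and `build_demo_tar` are hypothetical helper names, not part of the original answer, and the demo archive holds placeholder bytes rather than real PNG data, so the decoding step is left out.

```python
import io
import tarfile


def iter_png_bytes(tar):
    """Yield (member_name, raw_bytes) for every .png regular file in an open TarFile."""
    for member in tar:
        if member.isfile() and member.name.lower().endswith('.png'):
            # extractfile() returns a file-like object backed by the archive;
            # nothing is written to disk.
            yield member.name, tar.extractfile(member).read()


def build_demo_tar():
    """Build a tiny in-memory tar with placeholder payloads (not real PNG data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        for name, payload in [('a.png', b'first'),
                              ('notes.txt', b'skip'),
                              ('b.png', b'second')]:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_tar(), mode='r') as demo_tar:
    found = dict(iter_png_bytes(demo_tar))
```

With a real archive, each yielded payload could presumably be turned into an array with np.frombuffer(data, dtype=np.uint8) and passed to cv2.imdecode, just as in the answer above.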
