Problem description
I have the following code, but obviously it is not real streaming. It is the best I could find, but it reads the whole input file into memory first. I want to stream the decrypted data to the tarfile module without using all my memory when decrypting huge files (>100 GB).
import io
import tarfile, gnupg

gpg = gnupg.GPG(gnupghome='C:/Users/niels/.gnupg')
# The encrypted file is binary, so open it with 'rb'
with open('103330-013.tar.gpg', 'rb') as input_file:
    # input_file.read() loads the whole encrypted file into memory
    decrypted_data = gpg.decrypt(input_file.read(), passphrase='aaa')
# decrypted_data.data contains the data
decrypted_stream = io.BytesIO(decrypted_data.data)
tar = tarfile.open(fileobj=decrypted_stream, mode='r|')
tar.extractall()
tar.close()
Recommended answer
Apparently, real streaming is not possible with the gnupg module: it always reads the whole output of gpg into memory. So to stream for real, you will have to run the gpg program directly. Here is some sample code (without proper error handling):
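import subprocess
import tarfile

with open('103330-013.tar.gpg', 'rb') as input_file:
    # GnuPG 2.1+ may also need --batch and --pinentry-mode loopback for --passphrase to take effect
    gpg = subprocess.Popen(("gpg", "--decrypt", "--homedir", 'C:/Users/niels/.gnupg', '--passphrase', 'aaa'), stdin=input_file, stdout=subprocess.PIPE)
    # mode "r|" reads the archive as a non-seekable stream, block by block
    tar = tarfile.open(fileobj=gpg.stdout, mode="r|")
    tar.extractall()
    tar.close()
    gpg.stdout.close()
    gpg.wait()  # reap the gpg process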
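The sample above leaves out error handling. As a rough sketch of one way to add it, the helper below wraps the same pipe-to-tarfile idea in a function. The name extract_tar_from_command and its parameters are made up for illustration and are not part of tarfile or gnupg. It assumes the command writes a plain tar stream to stdout, and it drains any leftover output before waiting on the process, on the assumption that the tar reader may stop before the very last bytes of padding.

```python
import subprocess
import tarfile


def extract_tar_from_command(cmd, input_path, dest="."):
    """Feed input_path to cmd on stdin and unpack the tar stream from its stdout.

    Hypothetical helper for illustration; cmd is any argument list whose
    stdout is an uncompressed tar stream (for example a gpg --decrypt call).
    """
    with open(input_path, "rb") as input_file:
        proc = subprocess.Popen(cmd, stdin=input_file, stdout=subprocess.PIPE)
        try:
            # "r|" makes tarfile read strictly sequentially, so the pipe
            # never has to support seek() and nothing is buffered in full.
            with tarfile.open(fileobj=proc.stdout, mode="r|") as tar:
                tar.extractall(dest)
            # Assumption: there may be trailing padding tarfile did not read;
            # consume it so the child process can finish writing and exit.
            while proc.stdout.read(65536):
                pass
        except BaseException:
            proc.kill()  # do not leave the child running after a failure
            raise
        finally:
            proc.stdout.close()
            returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)
```

With the command line from the answer, a call might look like extract_tar_from_command(("gpg", "--decrypt", "--homedir", 'C:/Users/niels/.gnupg', '--passphrase', 'aaa'), '103330-013.tar.gpg'). Raising on a non-zero exit code is a design choice here: it means a failed decryption is reported even if some archive members were already extracted.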