Difference between `open` and `io.BytesIO` in binary streams

Problem description

I'm learning about working with streams in Python, and I noticed that the IO docs say the following:

The easiest way to create a binary stream is with open() with 'b' in the mode string:

f = open("myfile.jpg", "rb")

In-memory binary streams are also available as BytesIO objects:

f = io.BytesIO(b"some initial binary data: \x00\x01")

What is the difference between f as defined by open and f as defined by BytesIO? In other words, what makes something an "in-memory binary stream", and how is that different from what open does?

Recommended answer

For simplicity's sake, let's consider writing instead of reading for now.

So say you use open() like this:

with open("test.dat", "wb") as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")

After executing that, a file called test.dat will be created, containing Hello World three times. The data won't be kept in memory after it's written to the file (unless a name still refers to it).
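
As a minimal sketch of that behavior, the snippet below writes the same three chunks and then checks what ended up on disk. The temporary directory and the names path, size_on_disk, and contents are assumptions added for illustration, so the example cleans up after itself.

```python
import os
import tempfile

# Assumption: a throwaway directory, so no stray test.dat is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.dat")

    with open(path, "wb") as f:
        f.write(b"Hello World")
        f.write(b"Hello World")
        f.write(b"Hello World")

    # The file object is closed at this point, but the bytes persist on disk.
    size_on_disk = os.path.getsize(path)

    # Reopen the file to read the data back from disk.
    with open(path, "rb") as f:
        contents = f.read()

print(size_on_disk)                    # 33 (3 x 11 bytes)
print(contents == b"Hello World" * 3)  # True
```

Nothing in the program held on to the written bytes; they were read back from the file itself.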

Now consider io.BytesIO() instead:

with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")

Instead of writing the contents to a file, this writes them to an in-memory buffer, in other words a chunk of RAM. Writing the following would be essentially equivalent:

buffer = b""
buffer += b"Hello World"
buffer += b"Hello World"
buffer += b"Hello World"

To match the example that uses the with statement, there would also be a del buffer at the end.
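
The equivalence can be checked with a short sketch. It reads the buffer with getvalue() before the with block ends, then shows that the stream is unusable afterwards; the names data and closed_error are made up for this illustration.

```python
import io

with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")
    # getvalue() returns the entire buffer as bytes,
    # regardless of the current stream position.
    data = f.getvalue()

# Leaving the with block closes the stream and discards its buffer,
# so any further access raises ValueError.
try:
    f.getvalue()
    closed_error = None
except ValueError as exc:
    closed_error = exc

print(data == b"Hello World" * 3)  # True
print(f.closed)                    # True
```

The bytes copied out by getvalue() survive; the buffer inside the closed BytesIO does not.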

The key difference here is optimization and performance. Because bytes objects are immutable, concatenating produces a new object each time, whereas io.BytesIO can append to its internal buffer, which makes it faster than concatenating all the b"Hello World" chunks one by one.

Just to demonstrate it, here's a small benchmark:

Concat: 1.3529 seconds
BytesIO: 0.0090 seconds

import io
import time

# Repeated concatenation: each += produces a new bytes object.
begin = time.perf_counter()
buffer = b""
for i in range(50000):
    buffer += b"Hello World"
end = time.perf_counter()
seconds = end - begin
print("Concat:", seconds)

# BytesIO: each write appends to an in-memory buffer.
begin = time.perf_counter()
buffer = io.BytesIO()
for i in range(50000):
    buffer.write(b"Hello World")
end = time.perf_counter()
seconds = end - begin
print("BytesIO:", seconds)

Besides the performance gain, using BytesIO instead of concatenating has the advantage that a BytesIO can be used in place of a file object. So say you have a function that expects a file object to write to: you can give it that in-memory buffer instead of a file.
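
A minimal sketch of that idea follows. write_greeting is a hypothetical function invented for this example; the point is only that it calls write() and does not care whether the target is a BytesIO or a file opened with open().

```python
import io
import os
import tempfile


def write_greeting(stream):
    """Hypothetical helper: writes to any binary file-like object."""
    stream.write(b"Hello World")


# In-memory target.
memory_stream = io.BytesIO()
write_greeting(memory_stream)
from_memory = memory_stream.getvalue()

# On-disk target (assumption: a temporary directory keeps the example tidy).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "greeting.dat")
    with open(path, "wb") as f:
        write_greeting(f)
    with open(path, "rb") as f:
        from_disk = f.read()

print(from_memory == from_disk)  # True
```

This is also why BytesIO is handy in tests: code that normally writes to disk can be exercised without touching the filesystem.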

As for reading, the difference is that open("myfile.jpg", "rb") returns a file object that reads the contents of myfile.jpg from disk, whereas BytesIO, again, is just a buffer in memory containing some data.
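
A small sketch of the reading side, under the assumption that the payload and the file name payload.bin are placeholders: both objects support the same read/seek/tell calls, but only the one returned by open() is backed by an operating-system file descriptor.

```python
import io
import os
import tempfile

payload = b"some initial binary data: \x00\x01"

# Reading from memory: the data is already inside the BytesIO.
memory_stream = io.BytesIO(payload)
first_four = memory_stream.read(4)
position = memory_stream.tell()
memory_stream.seek(0)               # rewind to the start
everything = memory_stream.read()

# Reading from disk: open() returns a buffered reader over a real file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "payload.bin")
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        file_is_buffered = isinstance(f, io.BufferedIOBase)
        file_first_four = f.read(4)
        file_has_descriptor = isinstance(f.fileno(), int)

# A BytesIO has no underlying file, so fileno() is unsupported.
try:
    memory_stream.fileno()
    memory_has_descriptor = True
except io.UnsupportedOperation:
    memory_has_descriptor = False

print(first_four == file_first_four)                # True
print(file_has_descriptor, memory_has_descriptor)   # True False
```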

Since BytesIO is just a buffer, if you wanted to write the contents to a file later, you'd have to do:

buffer = io.BytesIO()
# ...
with open("test.dat", "wb") as f:
    f.write(buffer.getvalue())
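
One possible alternative, shown here only as a sketch, is to stream the buffer into the file with shutil.copyfileobj instead of first materializing everything with getvalue(). The temporary directory and the name written are assumptions for the example. Note the seek(0): copying starts at the current position, which sits at the end of the buffer after writing.

```python
import io
import os
import shutil
import tempfile

buffer = io.BytesIO()
buffer.write(b"Hello World" * 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.dat")

    # Rewind first; otherwise copyfileobj would find nothing left to read.
    buffer.seek(0)
    with open(path, "wb") as f:
        shutil.copyfileobj(buffer, f)

    # Read the file back to confirm what was written.
    with open(path, "rb") as f:
        written = f.read()

print(written == buffer.getvalue())  # True
```

For small buffers the getvalue() version is simpler; copying in chunks mainly matters when the buffer is large.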

Also, you didn't mention a version; I'm using Python 3. Regarding the examples: I'm using the with statement instead of calling f.close().
