Problem description
Python documentation: https://docs.python.org/2/library/functions.html#open
open(name[, mode[, buffering]])
The documentation above says: "The optional buffering argument specifies the file’s desired buffer size: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size (in bytes). A negative buffering means to use the system default. If omitted, the system default is used."
When I use
filedata = open("file.txt", "r", 0)
or
filedata = open("file.txt", "r", 1)
or
filedata = open("file.txt", "r", 2)
or
filedata = open("file.txt", "r", -1)
or
filedata = open("file.txt", "r")
the output does not change. Each of the calls shown above prints the file at the same speed.
Output:
Mr. Bean is a British television series of fifteen 25-
minute episodes written by Robin Driscoll and starring Rowan Atkinson as
the title character. Different episodes were also written by Robin
Driscoll and Richard Curtis, and one by Ben Elton. Thirteen of the
episodes were broadcast on ITV, from the pilot on 1 January 1990, until
"Goodnight Mr. Bean" on 31 October 1995. A clip show, "The Best Bits of
Mr. Bean", was broadcast on 15 December 1995, and one episode, "Hair by
Mr. Bean of London", was not broadcast until 2006 on Nickelodeon.
So how is the buffering parameter of the open() function useful? What value of the buffering parameter is best to use?
Recommended answer
Enabling buffering means that you're not directly interfacing with the OS's representation of a file, or its file system API. Instead, a chunk of data is read from the raw OS file stream into a buffer and served from there until it is consumed, at which point more data is fetched into the buffer. In terms of the objects you get (with the io module, which is what open() uses in Python 3, when opening in binary mode), you'll get a BufferedIOBase object wrapping an underlying RawIOBase (which represents the raw file stream).
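To make the object layering concrete, here is a minimal sketch that opens the same file in a few modes and reports the type of object that comes back. It assumes Python 3 (where the built-in open() is io.open); the helper name describe_open_modes and the temporary file are made up for illustration and are not part of the question.

```python
import os
import tempfile


def describe_open_modes(path):
    """Return the type names of the objects open() hands back for a few modes."""
    names = {}
    # buffering=0 is only allowed in binary mode and yields the raw stream itself.
    with open(path, "rb", buffering=0) as raw:
        names["unbuffered binary"] = type(raw).__name__
    # Default buffering in binary mode wraps the raw stream in a buffered reader.
    with open(path, "rb") as buffered:
        names["buffered binary"] = type(buffered).__name__
        names["wrapped raw"] = type(buffered.raw).__name__
    # Text mode adds a decoding layer on top of the buffered reader.
    with open(path, "r") as text:
        names["text"] = type(text).__name__
    return names


# Hypothetical throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"Mr. Bean\n")
    sample_path = tmp.name

result = describe_open_modes(sample_path)
os.remove(sample_path)
print(result)
```

Under Python 3 the unbuffered binary handle should be a FileIO (a RawIOBase), the default binary handle a BufferedReader (a BufferedIOBase) whose .raw attribute is a FileIO, and the text handle a TextIOWrapper. Note that Python 3 rejects buffering=0 in text mode with a ValueError, so the first call in the question only works as written on Python 2.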
What is the benefit of this? Well, interfacing with the raw stream might have high latency, because the operating system has to deal with physical devices like the hard disk, and this may not be acceptable in all cases. Let's say you want to read three letters from a file every 5ms and your file is on a crusty old hard disk, or even a network file system. Instead of trying to read from the raw file stream every 5ms, it is better to load a bunch of bytes from the file into a buffer in memory, then consume it at will.
What size of buffer you choose will depend on how you're consuming the data. For the example above, a buffer size of 1 char would be awful, 3 chars would be alright, and any large multiple of 3 chars that doesn't cause a noticeable delay for your users would be ideal.
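The effect of the buffer size on the three-letters-at-a-time example can be sketched by counting how often the raw stream is actually hit. This is an illustration under stated assumptions, not a benchmark: it assumes Python 3, and CountingRaw and count_raw_reads are hypothetical names, with an in-memory stream standing in for the slow disk or network file system.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory raw stream that counts how often it is read."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Every call here stands in for one trip to the slow device.
        self.calls += 1
        return self._src.readinto(b)


def count_raw_reads(buffer_size, chunk=3, total=3000):
    """Read `total` bytes in `chunk`-sized pieces and return the raw read count."""
    raw = CountingRaw(b"abc" * (total // 3))
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    while reader.read(chunk):
        pass
    return raw.calls


tiny = count_raw_reads(buffer_size=1)
large = count_raw_reads(buffer_size=4096)
print("raw reads with a 1-byte buffer:", tiny)
print("raw reads with a 4096-byte buffer:", large)
```

With the large buffer, nearly all of the three-byte reads are served from memory, so the raw stream is touched only a handful of times; with the tiny buffer, roughly every read goes back to the raw stream. On a fast local disk with OS-level caching the difference is hard to notice when simply printing a small file, which is consistent with the question seeing no change in speed.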