Problem description
I have a large text file (~7 GB). I am looking for the fastest way to read a large text file. I have been reading about several approaches, such as reading chunk by chunk, to speed up the process.
For example, effbot suggests
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass  # do something
in order to process 96,900 lines of text per second. Other authors suggest using islice():
from itertools import islice
with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines
list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines.
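To make the chunking idea concrete, here is a minimal runnable sketch that wraps the islice() loop in a generator. The helper name iter_chunks, the chunk size of 4 and the throwaway demo file are assumptions made for illustration; they are not part of the original question.

```python
from itertools import islice
import os
import tempfile


def iter_chunks(path, n):
    """Yield successive lists of at most n lines from the text file at path."""
    with open(path) as f:
        while True:
            chunk = list(islice(f, n))  # next n lines, fewer at end of file
            if not chunk:
                break
            yield chunk


# Small demonstration on a throwaway 10-line file (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"line {i}\n" for i in range(10))
    demo_path = tmp.name

chunk_sizes = [len(chunk) for chunk in iter_chunks(demo_path, 4)]
os.remove(demo_path)
print(chunk_sizes)  # [4, 4, 2]
```

Only one chunk of n lines is held in memory at a time, so n trades memory use against the number of loop iterations. Whether this is actually faster than plain line iteration for a given file is something to measure rather than assume.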
Recommended answer
with open(<FILE>) as FileObj:
    for line in FileObj:
        print(line)  # or do some other thing with the line...
This will read one line at a time into memory, and close the file when done.
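As a self-contained sketch of the same pattern, the example below iterates over a file object line by line and does some simple work on each line. The function name count_matching_lines, the search string and the temporary demo file are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Count lines containing needle, iterating one line at a time."""
    count = 0
    with open(path, encoding="utf-8") as f:  # file is closed when the block exits
        for line in f:  # the file object yields lines lazily
            if needle in line:
                count += 1
    return count


# Small demonstration on a throwaway 6-line file (hypothetical data).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.writelines(
        ("error" if i % 3 == 0 else "ok") + f" {i}\n" for i in range(6)
    )
    demo_path = tmp.name

matches = count_matching_lines(demo_path, "error")
os.remove(demo_path)
print(matches)  # 2
```

Because the file object buffers reads internally, this loop does not need a manual chunk size, and memory use stays small regardless of file size.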