Efficient "tail" implementation

Problem description

Hi,

I have a very large file, e.g. over 200 MB, and I am going to use Python to code a "tail" command that gets the last few lines of the file. What is a good algorithm for this type of task in Python for very big files? Initially, I thought of reading everything from the file into an array and just taking the last few elements (lines), but since it's a very big file, I don't think that is efficient.

Thanks

Solution

I don't think this is a Python-specific issue; it is a generic problem for all "file as byte stream" systems. The problem is that a "line" is not a property of the file but of its content (some big-iron systems use "records" for lines, which can be addressed in O(1)).

So the simplest approach is to just read and drop lines until you reach the one you want:

    for x in f:
        if x_is_what_I_want(x):   # placeholder test
            do_something(x)       # placeholder action

If you really want, you can do the reverse lookup like this:

    import os
    f.seek(0, os.SEEK_END)   # jump to the end of the file
    x = f.tell()             # x is now the file size in bytes

then loop backward byte by byte until you find your stuff. This is quite cumbersome and may not be faster, depending on your content.

Well, 200 MB isn't all that big these days. But it's easy to code:

    # untested code
    with open(filename) as infile:
        tail = infile.readlines()[-tailcount:]   # keep only the last tailcount lines

and you're done. However, it will go through a lot of memory. The fastest approach is probably working through the file backwards, but that may take multiple tries to get everything you want:

    # untested code
    import os
    infile = open(filename, "rb")   # binary mode allows seeking relative to the end
    filesize = infile.seek(0, os.SEEK_END)
    blocksize = tailcount * expected_line_length
    while True:
        blocksize = min(blocksize, filesize)   # never seek before the start of the file
        infile.seek(-blocksize, os.SEEK_END)
        tail = infile.read().splitlines()
        # One extra line is required because the first one may be cut off.
        if len(tail) > tailcount or blocksize == filesize:
            break
        blocksize *= 2
    infile.close()
    tail = tail[-tailcount:]

It would probably be more efficient to read blocks backwards and paste them together, but I'm not going to get into that.

    <mike
    --
    Mike Meyer <mw*@mired.org>    http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

That actually is a pretty good idea. Just reverse the buffer and do a split: the last line becomes the first line, and so on. The logic would then be no different from reading from the beginning of the file. You just need to keep the last "half line" of the reversed buffer in case the wanted line happens to span a buffer boundary.
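The "read blocks backwards and paste them together" idea mentioned in the thread could be sketched roughly as follows. This is only an illustration, not code from any of the posters: the function name `tail`, the `block_size` parameter and its default of 4096 bytes are assumptions made for the example. The file is opened in binary mode, so lines come back as `bytes` without their line endings.

```python
import os


def tail(path, count, block_size=4096):
    """Return the last `count` lines of the file at `path` as bytes objects.

    Hypothetical helper: the name and the block_size default are illustrative.
    """
    if count <= 0:
        return []
    with open(path, "rb") as f:
        # Seeking to the end returns the file size in bytes.
        position = f.seek(0, os.SEEK_END)
        buffer = b""
        # Read fixed-size blocks from the end toward the start. We stop once
        # the buffer holds more than `count` newlines, which guarantees that
        # the last `count` lines are complete even if the first one is cut.
        while position > 0 and buffer.count(b"\n") <= count:
            step = min(block_size, position)
            position -= step
            f.seek(position)
            # Prepend the older block so the buffer stays in file order.
            buffer = f.read(step) + buffer
    return buffer.splitlines()[-count:]
```

Compared with doubling the block size and re-reading from the end, this variant reads each byte at most once, at the cost of rebuilding the buffer on every step. For a call such as `tail("big.log", 10)` (the file name is made up), only the last few kilobytes of the file are typically touched.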
10-24 01:44