Problem description
In my PHP application I need to read multiple lines starting from the end of many files (mostly logs). Sometimes I need only the last one, sometimes I need tens or hundreds. Basically, I want something as flexible as the Unix tail command.
There are questions here about how to get the single last line from a file (but I need N lines), and different solutions were given. I'm not sure which one is the best and which performs better.
Recommended answer
Methods overview
Searching on the internet, I came across different solutions. I can group them into three approaches:
- naive ones that use the file() PHP function;
- cheating ones that run the tail command on the system;
- mighty ones that happily jump around an opened file using fseek().
I ended up choosing (or writing) five solutions: a naive one, a cheating one and three mighty ones.
1. The most concise naive solution, using built-in array functions.
2. The only possible solution based on the tail command, which has a little big problem: it does not run if tail is not available, i.e. on non-Unix systems (Windows) or in restricted environments that don't allow system functions.
3. The solution in which single bytes are read from the end of the file, searching for (and counting) new-line characters, found here.
4. The multi-byte buffered solution optimized for large files, found here.
5. A slightly modified version of solution #4 in which the buffer length is dynamic, decided according to the number of lines to retrieve.
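To give an idea of what the naive approach looks like, here is a minimal sketch. It is not necessarily the exact code that was benchmarked; the function name tailNaive is made up for illustration.

```php
<?php
// Hypothetical helper: returns the last $lines lines of $path as an array.
function tailNaive(string $path, int $lines = 1): array
{
    if ($lines < 1) {
        return [];
    }
    // file() loads the WHOLE file into memory, one array element per line.
    $all = file($path, FILE_IGNORE_NEW_LINES);
    if ($all === false) {
        return [];
    }
    // A negative offset makes array_slice() count from the end of the array.
    return array_slice($all, -$lines);
}
```

The whole cost is in the file() call: the array is built for every line of the file even when only the last one is needed.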
All solutions work, in the sense that they return the expected result from any file and for any number of lines we ask for (except for solution #1, which can exceed PHP's memory limit on large files and return nothing). But which one is better?
To answer the question I ran tests. That's how these things are done, isn't it?
I prepared a sample 100 KB file by joining together different files found in my /var/log directory. Then I wrote a PHP script that uses each of the five solutions to retrieve 1, 2, ..., 10, 20, ..., 100, 200, ..., 1000 lines from the end of the file. Each single test is repeated ten times (that's something like 5 × 28 × 10 = 1400 tests), measuring the average elapsed time in microseconds.
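A timing loop of this kind could be sketched roughly as follows. This is an illustration, not the original test script; the names averageMicroseconds and lineCounts are hypothetical, and $solution stands for any callable that takes a path and a line count.

```php
<?php
// Hypothetical harness: average elapsed time of $solution over $repeat runs,
// expressed in microseconds.
function averageMicroseconds(callable $solution, string $path, int $lines, int $repeat = 10): float
{
    $total = 0.0;
    for ($i = 0; $i < $repeat; $i++) {
        $start = microtime(true);          // seconds, as a float
        $solution($path, $lines);
        $total += microtime(true) - $start;
    }
    return ($total / $repeat) * 1e6;
}

// The 28 line counts used in the tests: 1..10, 20..100, 200..1000.
function lineCounts(): array
{
    return array_merge(range(1, 10), range(20, 100, 10), range(200, 1000, 100));
}
```

Calling averageMicroseconds() once per solution and per entry of lineCounts() gives the 5 × 28 series, each averaged over ten runs.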
I ran the script on my local development machine (Xubuntu 12.04, PHP 5.3.10, 2.70 GHz dual core CPU, 2 GB RAM) using the PHP command line interpreter. Here are the results:
Solutions #1 and #2 seem to be the worst ones. Solution #3 is good only when we need to read a few lines. Solutions #4 and #5 seem to be the best ones. Note how a dynamic buffer size can optimize the algorithm: execution time is a little smaller for few lines, because of the reduced buffer.
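The idea of a buffered backward read with a dynamic buffer can be sketched like this. It is my own minimal illustration of the technique, not the benchmarked code: the function name tailBuffered and the 64 / 512 / 4096 byte thresholds are assumptions chosen for the example.

```php
<?php
// Hypothetical helper: returns the last $lines lines of $path as one string.
function tailBuffered(string $path, int $lines = 1): string
{
    if ($lines < 1) {
        return '';
    }
    $f = fopen($path, 'rb');
    if ($f === false) {
        return '';
    }
    // Assumed thresholds: a smaller buffer when only a few lines are wanted.
    $buffer = $lines < 2 ? 64 : ($lines < 10 ? 512 : 4096);

    fseek($f, 0, SEEK_END);
    $pos = ftell($f);
    $output = '';
    $newlines = 0;
    // Read chunks backwards until more than $lines newlines have been seen,
    // so that the first wanted line is guaranteed to be complete.
    while ($pos > 0 && $newlines <= $lines) {
        $read = min($pos, $buffer);
        $pos -= $read;
        fseek($f, $pos, SEEK_SET);
        $chunk = fread($f, $read);
        $output = $chunk . $output;
        $newlines += substr_count($chunk, "\n");
    }
    fclose($f);

    // Ignore a single trailing newline, then keep only the last $lines lines.
    if (substr($output, -1) === "\n") {
        $output = substr($output, 0, -1);
    }
    return implode("\n", array_slice(explode("\n", $output), -$lines));
}
```

With one line requested, a single small read is usually enough, which is where the reduced buffer pays off.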
Let's try with a bigger file. What if we have to read a 10 MB log file?
Now solution #1 is by far the worst one: in fact, loading the whole 10 MB file into memory is not a great idea. I also ran the tests on 1 MB and 100 MB files, and it's practically the same situation.
And for tiny log files? Here is the graph for a 10 KB file:
Solution #1 is the best one now! Loading 10 KB into memory isn't a big deal for PHP. Solutions #4 and #5 also perform well. However, this is an edge case: a 10 KB log means something like 150/200 lines...
Final thoughts
Solution #5 is heavily recommended for the general use case: it works great with every file size and performs particularly well when reading a few lines.
Avoid solution #1 if you need to read files bigger than 10 KB.
Solutions #2 and #3 aren't the best ones in any test I ran: #2 never runs in less than 2 ms, and #3 is heavily influenced by the number of lines you ask for (it works quite well only with 1 or 2 lines).
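To see why the byte-by-byte approach slows down as more lines are requested, here is a minimal sketch of the technique. It is an illustration rather than the benchmarked code, and the function name tailByteByByte is hypothetical.

```php
<?php
// Hypothetical helper: returns the last $lines lines of $path as one string,
// reading a single byte at a time from the end of the file.
function tailByteByByte(string $path, int $lines = 1): string
{
    if ($lines < 1) {
        return '';
    }
    $f = fopen($path, 'rb');
    if ($f === false) {
        return '';
    }
    fseek($f, 0, SEEK_END);
    $size = ftell($f);
    $pos = $size;
    $output = '';
    $found = 0;
    while ($pos > 0) {
        $pos--;
        fseek($f, $pos, SEEK_SET);      // one seek...
        $char = fgetc($f);              // ...and one read per byte
        if ($char === "\n") {
            if ($pos === $size - 1) {
                continue;               // skip the newline that ends the file
            }
            if (++$found === $lines) {
                break;                  // reached the start of the first wanted line
            }
        }
        $output = $char . $output;
    }
    fclose($f);
    return $output;
}
```

Every byte of every requested line costs an fseek() and an fgetc() call, so the work grows with the amount of text returned rather than with a fixed number of buffer reads.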