This article looks at XML XPath searches and looping over the results in PHP, and how to deal with the memory problems that come up. It may be a useful reference for anyone facing the same issue.

Problem description

I'm dealing with large XML files (several megabytes) on which I have to perform various kinds of checks. However, I have a problem with memory and time usage, which grow very quickly. I've tested it like this:

$xml = new SimpleXMLElement($string);
$sum_of_elements = 0.0;

foreach ($xml->xpath('//Amt') as $amt) {
  $sum_of_elements += (double)$amt;
}

With the microtime() and memory_get_usage() functions I get the following results from running this code:


  • 5 MB file (7480 Amt elements):

    • execution time 0.69 s

    • memory usage grows from 10.25 MB to 29.75 MB

That's still quite OK. But with a slightly bigger file, memory and time usage grow very much:

  • 6 MB file (8976 Amt elements):

    • execution time 8.53 s

    • memory usage grows from 10.25 MB to 99.25 MB
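For reference, the measurements above can be reproduced with a self-contained sketch like this (the sample XML string and its values are illustrative; the real files hold thousands of Amt elements):

```php
<?php
// Self-contained sketch of the benchmark described above.
// The sample XML is made up; real files are several megabytes.
$string = '<Root><Amt>1.5</Amt><Amt>2.5</Amt><Amt>4.0</Amt></Root>';

$start     = microtime(true);
$memBefore = memory_get_usage();

$xml = new SimpleXMLElement($string);
$sum_of_elements = 0.0;

foreach ($xml->xpath('//Amt') as $amt) {
    $sum_of_elements += (double)$amt;
}

printf("sum: %.1f, time: %.4f s, memory: %.2f MB -> %.2f MB\n",
       $sum_of_elements,
       microtime(true) - $start,
       $memBefore / 1048576,
       memory_get_usage() / 1048576);
```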

The problem seems to be in looping over the result set. I've also tried a for-loop instead of foreach, but with no difference. Without the loop, memory usage does not grow nearly as much.

Any idea where the problem could be?

Recommended answer

SimpleXML is tree-based and loads the entire document into memory. Using unset() inside the loop to mark resources that are no longer needed for cleanup by PHP's garbage collector might yield lower memory usage. If that doesn't solve the issue, consider using XMLReader for a pull-based approach. You won't be able to use XPath, but memory consumption should be significantly lower.
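As a sketch of that pull-based alternative, assuming the Amt element name from the question (the sample XML is made up), XMLReader streams the document node by node instead of building a full tree:

```php
<?php
// Hedged sketch: stream the document with XMLReader instead of building
// a full SimpleXML tree, so only the current node is held in memory.
// For a real file, use XMLReader::open('big.xml') instead of XML().
$string = '<Root><Tx><Amt>10.5</Amt></Tx><Tx><Amt>4.5</Amt></Tx></Root>';

$reader = XMLReader::XML($string);
$sum_of_elements = 0.0;

while ($reader->read()) {
    // No XPath here: match the element name manually while streaming.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'Amt') {
        $sum_of_elements += (double)$reader->readString();
    }
}
$reader->close();

echo $sum_of_elements, "\n"; // 15
```

The trade-off is exactly the one the answer describes: the manual name check replaces the `//Amt` XPath query, but memory stays flat regardless of file size.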
