Extracting data from a (huge) XML file using PHP

Question

I have a .xml file from which I would like to extract data. The original has over 7000 lines, but for testing I have been using one with just 3 lines in case the size was my problem (it wasn't). However, being automatically generated, the file is not terribly pretty. Observe:

<ROW MODID="182" RECORDID="561">
<COL>
<DATA>
</DATA>
</COL>
<COL>
<DATA>
6 quai St Pierre</DATA>
</COL>
<COL>
<DATA>
</DATA>
</COL>
<COL>
<DATA>
Monsieur</DATA>
</COL>
<COL>

And so on. I have already devised the requests I need and they run in Xacobeo, but I can't seem to get them to work with PHP. I've tried a multitude of variations, the last one being the following:

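$xmldoc = new DOMDocument();
// load() returns false if the file is missing or not well-formed
$xmldoc->load('hellashort.xml');
$xpathvar = new DOMXPath($xmldoc);
// select every <COL> element anywhere in the document
$queryResult = $xpathvar->query('//COL');
foreach ($queryResult as $result) {
    echo $result->textContent;
}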

I tried this code with more aesthetically pleasing files and it works, so how can I get it working with this file? All suggestions appreciated. Thanks.

UPDATE: I checked the short file for errors and realised I hadn't closed an element, so that one works now. The long file, however, contains no errors according to the online checker but still doesn't work.

UPDATE 2: The long file now works for the request / but returns nothing as soon as the requests get more complicated, e.g. //ROW/COL[position()=39]/DATA, which returns correct results in Xacobeo. Is it possible for a .xml file to be too big to be handled this way? (This file is about 11.2 MB.)

UPDATE 3 - FIXED: So I changed my approach and ended up doing it this way:

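// read the whole file into memory, then parse it with SimpleXML
$file = file_get_contents("go.xml");
$xml = simplexml_load_string($file);
// SimpleXMLElement::xpath() returns an array of matching elements
$elements = $xml->xpath('//ROW/COL[position()=1]/DATA');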

I see why it's called SimpleXML. Thanks for all the help, though.
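
For readers who want to see what the SimpleXML approach returns, here is a minimal sketch. It assumes a small inline sample shaped like the file in the question, wrapped in a hypothetical DOCUMENT root element, rather than the real go.xml. Each match is cast to a string and trimmed, since the DATA values in this file start with a newline.

```php
<?php
// Hypothetical sample mirroring the structure shown in the question.
$sample = <<<'XML'
<DOCUMENT>
<ROW MODID="182" RECORDID="561">
<COL>
<DATA>
</DATA>
</COL>
<COL>
<DATA>
6 quai St Pierre</DATA>
</COL>
</ROW>
</DOCUMENT>
XML;

$xml = simplexml_load_string($sample);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

// xpath() returns an array of SimpleXMLElement objects on success;
// fall back to an empty array if the query itself fails.
$elements = $xml->xpath('//ROW/COL[position()=2]/DATA') ?: [];

$values = [];
foreach ($elements as $element) {
    // Cast to string to get the text, then strip the surrounding newlines.
    $values[] = trim((string) $element);
}

print_r($values);
```

The same loop should work on the result of the query in UPDATE 3; only the position in the predicate changes.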

Recommended answer

Make sure your document has exactly one root element, i.e. an element to contain all the <ROW> elements:

<DOCUMENT>
<ROW MODID="182" RECORDID="561">
<COL>
<DATA>
</DATA>
</COL>
<COL>
<DATA>
6 quai St Pierre</DATA>
</COL>
<COL>
<DATA>
<!-- ... -->
</DOCUMENT>

If you have multiple rows without a root, it's not a well-formed XML file and parsing will fail:

<ROW MODID="182" RECORDID="561">
<COL>
<DATA>
</DATA>
</COL>
<COL>
<!-- ... -->
</ROW>
<ROW MODID="183" RECORDID="562">
<!-- ... -->
</ROW>
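
One way to confirm that a missing root is the problem is to ask libxml for its parse errors instead of relying on an online checker. The sketch below is only an illustration: the inline input is hypothetical (the value "Madame" is made up for the second row), and the wrapping workaround assumes the file has no XML declaration or DOCTYPE at the top, since those cannot appear inside an element.

```php
<?php
// Hypothetical rootless input, shaped like the failing example above.
$rootless = <<<'XML'
<ROW MODID="182" RECORDID="561"><COL><DATA>Monsieur</DATA></COL></ROW>
<ROW MODID="183" RECORDID="562"><COL><DATA>Madame</DATA></COL></ROW>
XML;

// Collect libxml errors in a buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$messages = [];
if (!$doc->loadXML($rootless)) {
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
}

// Workaround: wrap the rows in a synthetic root element before parsing.
// DOCUMENT is an arbitrary name; any single enclosing element would do.
$wrapped = new DOMDocument();
$wrapped->loadXML('<DOCUMENT>' . $rootless . '</DOCUMENT>');
$xpath = new DOMXPath($wrapped);
$rowCount = $xpath->query('//ROW')->length;

print_r($messages);            // parse errors reported for the rootless input
echo $rowCount, " rows\n";     // number of ROW elements found after wrapping
```

Fixing whatever generates the file so that it emits a root element is the cleaner solution; wrapping at load time is a stopgap when the generator cannot be changed.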
