Parsing HTML with DOMDocument


Question

I'm parsing HTML with DOMDocument in PHP.

I found that I can't select all the links using an XPath query. However, the getElementsByTagName() method works fine.

Here is the code:

$xml = new DOMDocument();
$xml->load("file.html");
$xpath = new DOMXPath($xml);

$links = $xpath->query("//a");
$links2 = $xml->getElementsByTagName("a");

foreach ($links as $k => $link) {
    echo "<br>$k: " . $link->nodeValue; // this doesn't print the node value; $links is empty
}
foreach ($links2 as $k => $link) {
    echo "<br>$k: " . $link->nodeValue; // this prints the node value OK
}

I'd have thought $xpath->query("//a") would be the same as getElementsByTagName("a"), but apparently it isn't.

Could anybody tell me why they aren't the same? Or, if they are, what am I doing wrong when selecting the nodes with the XPath query?

Thanks.

Recommended answer

Cannot reproduce.

If you want to use load or loadXML, your markup has to be valid X(HT)ML. HTML is SGML-based. Try loadHTML or loadHTMLFile instead.
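As a rough illustration of that suggestion, the sketch below parses a small HTML string with loadHTML and runs both lookups against it. The sample markup and the variable names are assumptions made for the example (the question reads "file.html" instead), and libxml_use_internal_errors is used only to keep parser warnings out of the output.

```php
<?php
// Hypothetical sample markup; note the unclosed <br>, which is fine in HTML but not in XML.
$html = '<html><body><p><a href="/one">One</a><br><a href="/two">Two</a></p></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($html);              // for a file on disk, loadHTMLFile("file.html") is the counterpart
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$viaXPath = $xpath->query("//a");
$viaTagName = $doc->getElementsByTagName("a");

// Both node lists should now contain the same two links.
foreach ($viaXPath as $k => $link) {
    echo $k . ": " . $link->nodeValue . "\n";
}
echo "XPath matches: " . $viaXPath->length . "\n";
echo "getElementsByTagName matches: " . $viaTagName->length . "\n";
```

With the HTML parser in place, the XPath query and getElementsByTagName should agree on the number of links in this sample.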

Note that when you use loadHTML or loadHTMLFile, DOM will try to repair any invalid HTML to the extent that it is workable for DOM. For instance, it will add a basic HTML skeleton around any partial HTML document, and that can affect your XPath queries (though not in the case of //a).
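The effect of the added skeleton can be seen with a small fragment. This is a minimal sketch with made-up markup and variable names, assuming loadHTML is called without any extra libxml options.

```php
<?php
$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML('<p>Hello <a href="/x">link</a></p>');   // fragment without <html> or <body>
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The parser wraps the fragment, so the document element is <html>, not <p>.
$rootName = $doc->documentElement->nodeName;

$atRoot   = $xpath->query("/p")->length;            // absolute path written for the bare fragment
$nested   = $xpath->query("/html/body/p")->length;  // absolute path that accounts for the skeleton
$anywhere = $xpath->query("//a")->length;           // descendant search, unaffected by the wrapping

echo "root element: $rootName\n";
echo "/p matches: $atRoot\n";
echo "/html/body/p matches: $nested\n";
echo "//a matches: $anywhere\n";
```

An absolute path written against the bare fragment stops matching once the wrapper elements exist, while a descendant query such as //a keeps working wherever the element ends up.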

