Problem description
I have the following HTML, which I'm reading from a movie website:
<div class="blue">
  Director <a href="http://...">Bobby Farrelly</a>, <a href="http://...">Peter Farrelly</a>. With <a href="http://...">Jim Carrey</a>, <a href="http://...">Jeff Daniels</a>.
  <div class="red">
    page 1
  </div>
</div>
I'm trying to separate the director(s) from the actors using XPath. As you can see:
directors are: Bobby Farrelly and Peter Farrelly
actors are: Jim Carrey and Jeff Daniels
The only way to distinguish directors from actors in this badly formed markup is to detect the string ". With" and select the A tags that come before it.
By using:
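// $r is assumed to be a list of DOM elements (for example, the matched DIVs).
foreach ($r as $result) {
    $tag = $result->getElementsByTagName("a");
    foreach ($tag as $text) {
        // Collapse line breaks into single spaces and trim the link text.
        $t = trim(preg_replace("/[\r\n]+/", " ", $text->nodeValue));
    }
}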
With this I can select the DIV and the text inside the A tags. However, it selects ALL the A tags. To get only the directors, I need to select the text inside the A tags that come before the ". With" string.
One possible XPath:
//div[@class="blue"]/a[following-sibling::text()[contains(., "With")]]
The XPath above reads: find all div elements whose class attribute equals "blue". Then, within each such div, select every <a> element that is followed by a sibling text node containing the text "With".
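To connect this expression back to the PHP loop in the question, here is a minimal sketch of how it might be run with DOMDocument and DOMXPath. It assumes the sample markup from the question with the missing closing quote on the third href restored; the variable names ($html, $doc, $query, $directors) are illustrative and not from the original post.

```php
<?php
// Sample markup from the question; the elided "http://..." URLs are kept as-is.
// Assumption: the missing closing quote on the third href has been restored.
$html = <<<'HTML'
<div class="blue">
Director <a href="http://...">Bobby Farrelly</a>, <a href="http://...">Peter Farrelly</a>. With <a href="http://...">Jim Carrey</a>, <a href="http://...">Jeff Daniels</a>.
<div class="red">
page 1
</div>
</div>
HTML;

$doc = new DOMDocument();
// Collect libxml warnings internally instead of emitting them for sloppy HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Keep only <a> elements that still have a "With" text node after them.
$query = '//div[@class="blue"]/a[following-sibling::text()[contains(., "With")]]';

$directors = [];
foreach ($xpath->query($query) as $a) {
    // Same whitespace normalization as the loop in the question.
    $directors[] = trim(preg_replace("/[\r\n]+/", " ", $a->nodeValue));
}

print_r($directors);
```

This prints the link text of the matched elements rather than the elements themselves. Note that contains() is case-sensitive, so the match depends on the page spelling the separator exactly as "With".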
Output in an XPath tester:
'<a href="http://...">Bobby Farrelly</a>'
'<a href="http://...">Peter Farrelly</a>'
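The answer covers only the directors. The actors could plausibly be selected by flipping the axis to preceding-sibling, so that an <a> element matches only when a "With" text node comes before it. The following is a sketch under that assumption, using a one-line copy of the question's markup with the broken href quote repaired; the variable names are hypothetical.

```php
<?php
// One-line copy of the question's markup (assumption: broken href quote repaired).
$html = '<div class="blue">Director <a href="http://...">Bobby Farrelly</a>, '
      . '<a href="http://...">Peter Farrelly</a>. With '
      . '<a href="http://...">Jim Carrey</a>, <a href="http://...">Jeff Daniels</a>.'
      . '<div class="red">page 1</div></div>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Mirror image of the directors query: "With" must appear in an earlier sibling text node.
$query = '//div[@class="blue"]/a[preceding-sibling::text()[contains(., "With")]]';

$actors = [];
foreach ($xpath->query($query) as $a) {
    $actors[] = trim(preg_replace("/[\r\n]+/", " ", $a->nodeValue));
}

print_r($actors);
```

Both queries rely on the literal word "With" appearing in a text node directly inside the div. If the site changes that wording, or another sibling text node happens to contain "With", the split would go wrong, so a stricter test such as contains(., ". With") may be worth considering.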