Parsing HTML with HTML Agility Pack
Question
I want to collect all anchor (<a>) tags from this div, but I don't know the best way to do it with XPath:
<div class="biz_info">
<h3><a href="/profil/78122/s%C3%B8rby-rehab/">Sørby Rehab</a></h3>
<table class="string_14">
<tbody>
<tr>
<td>Postadr.:</td>
<td class="tab_space">Rognerudveien 8 B, 0681 Oslo</td>
</tr>
<tr>
<td>Telefon:</td>
<td class="tab_space">928 70 700</td>
</tr>
<tr>
<td>Nettside:</td>
<td class="tab_space"><a href="http://www.sorby-rehab.no" target="_blank">www.sorby-rehab.no</a></td>
</tr>
</tbody>
</table>
</div>
Currently my code looks like this (but it is very bad):
HtmlDocument doc = new HtmlDocument();
doc.Load(new StringReader(result));
HtmlNode root = doc.DocumentNode;
List<string> anchorTags = new List<string>();
foreach (HtmlNode link in root.SelectNodes("//@class=biz_info"))
{
string att = link.OuterHtml;
anchorTags.Add(att);
}
Can someone who is experienced with XPath help me?
Solution
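// Requires: using System.IO; using System.Linq; using HtmlAgilityPack;
HtmlDocument html = new HtmlDocument();
html.Load(new StringReader(result));
// Select every <a> element nested anywhere inside <div class="biz_info">.
// Note: SelectNodes returns null when nothing matches.
var anchorTags = html.DocumentNode.SelectNodes("//div[@class='biz_info']//a")
                     .Select(a => a.OuterHtml)
                     .ToList();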
That will give you a list of the anchor tags' HTML. If you need just the URLs:
// Keep only anchors whose href attribute is non-empty, then read its value.
var urls = html.DocumentNode.SelectNodes("//div[@class='biz_info']//a[@href!='']")
               .Select(a => a.Attributes["href"].Value)
               .ToList();
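
One thing to watch for with the snippets above: SelectNodes returns null rather than an empty collection when the XPath matches nothing, so chaining .Select directly onto it can throw on pages that have no biz_info div. The following is a minimal sketch of a null-safe variant, not part of the original answer. The class name BizInfoLinks, the helper ExtractUrls, and the shortened sample markup are hypothetical names chosen for illustration; it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class BizInfoLinks
{
    // Hypothetical helper: returns the href of every <a> inside
    // <div class="biz_info">, or an empty list when nothing matches.
    public static List<string> ExtractUrls(string result)
    {
        var html = new HtmlDocument();
        html.LoadHtml(result); // parse directly from the string

        // SelectNodes yields null (not an empty collection) when there is no match.
        var nodes = html.DocumentNode.SelectNodes("//div[@class='biz_info']//a[@href]");
        if (nodes == null)
            return new List<string>();

        return nodes
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => href.Length > 0) // skip empty href values
            .ToList();
    }

    public static void Main()
    {
        // Shortened version of the markup from the question.
        const string sample =
            "<div class=\"biz_info\">" +
            "<h3><a href=\"/profil/78122/s%C3%B8rby-rehab/\">Sørby Rehab</a></h3>" +
            "<table><tr><td class=\"tab_space\">" +
            "<a href=\"http://www.sorby-rehab.no\" target=\"_blank\">www.sorby-rehab.no</a>" +
            "</td></tr></table>" +
            "</div>";

        foreach (var url in ExtractUrls(sample))
            Console.WriteLine(url);
    }
}
```

Returning an empty list keeps the calling code free of null checks; whether that or throwing is the better choice depends on how the scraped pages are expected to vary.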