Question
I am trying to use HtmlAgilityPack with VS2008 / .NET 3.5. I get the error below even when I set OptionUseIdAttribute to true, although it is supposed to be true by default.
Error Message:
You need to set UseIdAttribute property to true to enable this feature
Stack Trace:
at HtmlAgilityPack.HtmlDocument.GetElementbyId(String id)
I tried versions 1.4.6 and 1.4.0; neither worked.
Version 1.4.6 - Net20/HtmlAgilityPack.dll
Version 1.4.0 - Net20/HtmlAgilityPack.dll
Here is the code:
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(url);
HtmlNode table = doc.GetElementbyId("tblThreads");
This does not work either:
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = new HtmlDocument { OptionUseIdAttribute = true };
doc = web.Load(url);
HtmlNode table = doc.GetElementbyId("tblThreads");
How can I fix this issue? Thanks.
Recommended answer
First I used ILSpy on the 1.4.0 HAP DLL. I navigated to the HtmlDocument class and could see that the GetElementbyId method looks like this:
// HtmlAgilityPack.HtmlDocument
/// <summary>
/// Gets the HTML node with the specified 'id' attribute value.
/// </summary>
/// <param name="id">The attribute id to match. May not be null.</param>
/// <returns>The HTML node with the matching id or null if not found.</returns>
public HtmlNode GetElementbyId(string id)
{
    if (id == null)
    {
        throw new ArgumentNullException("id");
    }
    if (this._nodesid == null)
    {
        throw new Exception(HtmlDocument.HtmlExceptionUseIdAttributeFalse);
    }
    return this._nodesid[id.ToLower()] as HtmlNode;
}
I then got ILSpy to analyze "_nodesid", because in your case it is, for some reason, not being set. "HtmlDocument.DetectEncoding(TextReader)" and "HtmlDocument.Load(TextReader)" are the methods that assign a value to "_nodesid".
Hence you could try an alternative way of reading the content from the URL, one in which "_nodesid" will definitely be assigned, e.g.
var doc = new HtmlDocument();
var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
using (var response = (HttpWebResponse)request.GetResponse())
{
    using (var stream = response.GetResponseStream())
    {
        doc.Load(stream);
    }
}
var table = doc.GetElementbyId("tblThreads");
This approach ensures that "HtmlDocument.Load(TextReader)" is called, and in that code I can see that "_nodesid" will definitely get assigned, so this approach may work (I haven't compiled the code I've suggested).
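A variant on the same idea, offered only as an untested sketch: download the markup as a string and hand it to HtmlDocument.LoadHtml, so that parsing happens on an HtmlDocument instance you created and configured yourself. In the second snippet from the question, by contrast, the configured document is discarded because doc is reassigned to whatever web.Load(url) returns. The URL below is a placeholder, and it is my assumption (not verified against the 1.4.0 or 1.4.6 binaries) that LoadHtml populates the id lookup the same way Load(TextReader) does.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Placeholder address; substitute the page that contains the table.
        string url = "http://example.com/";

        string html;
        using (var client = new WebClient())
        {
            // DownloadString decodes the response using client.Encoding,
            // which may not match the charset the page declares.
            html = client.DownloadString(url);
        }

        var doc = new HtmlDocument();
        // Set explicitly on the instance that parses, and before parsing.
        doc.OptionUseIdAttribute = true;
        doc.LoadHtml(html);

        // Assumption: LoadHtml fills the same id lookup as Load(TextReader).
        HtmlNode table = doc.GetElementbyId("tblThreads");
        Console.WriteLine(table == null ? "tblThreads not found" : "found <" + table.Name + ">");
    }
}
```

If this works while HtmlWeb.Load still throws, that would point at how HtmlWeb sets up its document rather than at the page itself.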