Parsing an HTML page with HtmlAgilityPack and LINQ
Question
How can I parse the HTML of a web page using LINQ and add values to a string? I am using HtmlAgilityPack in a Metro application and would like to bring back three values and add them to a string.
This is the URL: http://explorer.litecoin.net/address/Li7x5UZqWUy7o2t1
I would like to get the following values from that page:
"Balance:","Transactions in","Received"
    WebResponse x = await req.GetResponseAsync();
    HttpWebResponse res = (HttpWebResponse)x;
    if (res != null)
    {
        if (res.StatusCode == HttpStatusCode.OK)
        {
            Stream stream = res.GetResponseStream();
            using (StreamReader reader = new StreamReader(stream))
            {
                html = reader.ReadToEnd();
            }
            HtmlDocument htmlDocument = new HtmlDocument();
            htmlDocument.LoadHtml(html);
            string appName = htmlDocument.DocumentNode.Descendants // not sure what t
            string a = "Name: " + WebUtility.HtmlDecode(appName);
        }
    }
Recommended answer
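Have a go at the following. You could also consider pulling apart the table on the page, since it is somewhat better formatted than the free text in the 'p' tag.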
    // Download the site content and create a new HTML document.
    // NOTE: make this asynchronous etc. when considering IO performance.
    var url = "http://explorer.litecoin.net/address/Li7x5UZqWUy7o1tEC2x5o6cNsn2bmDxA2N";
    var data = new WebClient().DownloadString(url);
    var doc = new HtmlDocument();
    doc.LoadHtml(data);

    // Extract the 'Transactions' h3 title; the node we want is directly before it.
    // FirstOrDefault returns null if the page has no such heading.
    var transTitle = (from h3 in doc.DocumentNode.Descendants("h3")
                      where h3.InnerText.ToLower() == "transactions"
                      select h3).FirstOrDefault();

    // Tokenise the summary: one line per 'br' element, each line split by the ':' symbol.
    var summary = transTitle.PreviousSibling.PreviousSibling;
    var tokens = (from row in summary.InnerHtml.Replace("<br>", "|").Split('|')
                  where !string.IsNullOrEmpty(row.Trim())
                  let line = row.Trim().Split(':')
                  where line.Length == 2
                  select new { name = line[0].Trim(), value = line[1].Trim() });

    // Using LINQPad to debug: the Dump command writes the current variable to the output.
    tokens.Dump();
'Dump()' is a LINQPad command that dumps a variable to the output pane; it is not available in an ordinary application.
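To try the tokenising step outside LINQPad, something like the sketch below may help. It uses only the .NET base class library and runs on a hard-coded fragment, so it does not touch the network or HtmlAgilityPack. The class name SummaryTokenizer, the method Tokenize, and the sample values are all hypothetical, and the real page's markup may differ. Note one deliberate change from the answer: each row is split at its first ':' only, so a value that itself contains a colon is kept rather than dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SummaryTokenizer
{
    // Hypothetical helper: turns "Name: value<br>Name: value" into name/value pairs.
    // One row per <br> variant; each row is split at its FIRST ':' only.
    public static List<KeyValuePair<string, string>> Tokenize(string innerHtml)
    {
        var rows = innerHtml.Split(new[] { "<br>", "<br/>", "<br />" },
                                   StringSplitOptions.RemoveEmptyEntries);

        return (from row in rows
                let line = row.Trim()
                let colon = line.IndexOf(':')
                where colon > 0                      // skip blank rows and rows with no name
                select new KeyValuePair<string, string>(
                    line.Substring(0, colon).Trim(),
                    line.Substring(colon + 1).Trim())).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed sample data, not taken from the real page.
        var sample = "Balance: 1.5 LTC<br>Transactions in: 3<br>Received: 4.5 LTC<br>";

        // Plain console output stands in for LINQPad's Dump().
        foreach (var token in SummaryTokenizer.Tokenize(sample))
        {
            Console.WriteLine(token.Key + " = " + token.Value);
        }
    }
}
```

In the answer's code, the string passed to Tokenize would be summary.InnerHtml, and the three requested values could then be concatenated from the resulting pairs.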