Problem description
I am trying to get information from "https://www.sideshow.com/collectibles?manufacturer=Hot+Toys", specifically the div with class "c-ProductList row ss-targeted", but nothing seems to be retrieved. Any clues?
var test = page.DocumentNode.SelectNodes("//div[@class='c-ProductList row ss-targeted']");
Recommended answer
The content you want is generated after the page loads, using JavaScript and Ajax. HAP (HTML Agility Pack) cannot get it unless a browser runs in the background and executes the scripts on the page.
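To make the failure mode concrete, here is a minimal sketch of what HAP sees when it is handed markup that has not been processed by a browser. The HTML string is a hypothetical stand-in for the server response (the real page's initial markup is not shown in this thread); the point is only that HAP parses text and never runs scripts, and that SelectNodes returns null when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class StaticHtmlDemo
{
    static void Main()
    {
        // Hypothetical stand-in for the markup sent before any script runs:
        // the results container exists, but the product list has not been injected yet.
        const string rawHtml = "<html><body><div class='ss-results'></div></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(rawHtml);

        // HAP only parses the text it is given; it never executes JavaScript.
        var nodes = doc.DocumentNode.SelectNodes("//div[@class='c-ProductList row ss-targeted']");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        Console.WriteLine(nodes == null ? "No matching nodes" : nodes.Count + " node(s) found");
    }
}
```

This is why the result of SelectNodes should be checked for null before it is enumerated.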
.NET Core 2.0
Prerequisite: the Chrome web browser must be installed on your PC.
- Create a console application.
- Install the NuGet packages:
    Install-Package HtmlAgilityPack
    Install-Package Selenium.WebDriver
    Install-Package Selenium.Chrome.WebDriver
- Replace the Main method with the following code:
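    // At the top of the file:
    // using System; using HtmlAgilityPack; using OpenQA.Selenium; using OpenQA.Selenium.Chrome;
    static void Main(string[] args)
    {
        string url = "https://www.sideshow.com/collectibles?manufacturer=Hot+Toys";
        // The constructor argument is the directory that contains the chromedriver executable
        using (var browser = new ChromeDriver(Environment.CurrentDirectory))
        {
            // Let element lookups wait up to 30 seconds for the page scripts to run
            browser.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(30);
            browser.Navigate().GoToUrl(url);
            var results = browser.FindElement(By.ClassName("ss-results"));
            // Hand the rendered markup to HAP for parsing
            var doc = new HtmlDocument();
            doc.LoadHtml(results.GetAttribute("innerHTML"));
            // Show results
            var list = doc.DocumentNode.SelectSingleNode("//div[@class='c-ProductList row ss-targeted']");
            var titles = list?.SelectNodes(".//h2[@class='c-ProductListItem__title ng-binding']");
            if (titles == null)
            {
                Console.WriteLine("Product list not found.");
            }
            else
            {
                foreach (var title in titles)
                {
                    Console.WriteLine(title.InnerText);
                }
            }
        }
        Console.ReadLine();
    }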
.NET 4.6
- Create a console application.
- Install the NuGet package:
    Install-Package HtmlAgilityPack
- In Solution Explorer, add a reference to System.Windows.Forms.
- Add using statements as required.
- Replace the Main method with the following code:
    // At the top of the file:
    // using System; using System.Windows.Forms; using HtmlAgilityPack;
    [STAThread]
    static void Main(string[] args)
    {
        string url = "https://www.sideshow.com/collectibles?manufacturer=Hot+Toys";
        var web = new HtmlWeb();
        web.BrowserTimeout = TimeSpan.FromSeconds(30);
        var doc = web.LoadFromBrowser(url, o =>
        {
            var webBrowser = (WebBrowser)o;
            // Wait until the list shows up
            return webBrowser.Document.Body.InnerHtml.Contains("c-ProductList");
        });
        // Show results
        var list = doc.DocumentNode.SelectSingleNode("//div[@class='c-ProductList row ss-targeted']");
        var titles = list?.SelectNodes(".//h2[@class='c-ProductListItem__title ng-binding']");
        if (titles == null)
        {
            Console.WriteLine("Product list not found.");
        }
        else
        {
            foreach (var title in titles)
            {
                Console.WriteLine(title.InnerText);
            }
        }
        Console.ReadLine();
    }
Displays a list starting with:
John Wick
John Wick
Punisher War Machine Armor
Wonder Woman Deluxe Version
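One caveat about the XPath used in both versions: a test such as @class='c-ProductList row ss-targeted' compares the whole attribute string, so it stops matching if the rendered page adds or reorders a class. The sketch below shows a token-based alternative using standard XPath 1.0 functions. The markup, including the extra ng-scope class, is a hypothetical example and not taken from the real page.

```csharp
using System;
using HtmlAgilityPack;

class ClassTokenDemo
{
    static void Main()
    {
        // Hypothetical rendered markup; the extra ng-scope class is an assumption for illustration.
        const string html =
            "<div class='c-ProductList row ss-targeted ng-scope'>" +
            "<h2 class='c-ProductListItem__title ng-binding'>John Wick</h2>" +
            "</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Exact string comparison fails because the attribute value has an extra token.
        var exact = doc.DocumentNode.SelectSingleNode("//div[@class='c-ProductList row ss-targeted']");
        Console.WriteLine(exact == null ? "Exact match: not found" : "Exact match: found");

        // Token match: pad the class list with spaces and look for one whole class name.
        var titles = doc.DocumentNode.SelectNodes(
            "//h2[contains(concat(' ', normalize-space(@class), ' '), ' c-ProductListItem__title ')]");
        if (titles != null)
        {
            foreach (var title in titles)
            {
                Console.WriteLine(title.InnerText);
            }
        }
    }
}
```

Whether the extra robustness is needed depends on how stable the site's class attributes are; the exact comparison in the answer works as long as the attribute value is exactly as shown.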