Unable to crawl web page data using ScrapySharp


Problem description

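Hi all,

I am facing a technical issue. I browsed several articles looking for an answer, but I could not find a proper one on any website.

I am using ScrapySharp in my project to crawl web page data. The issue appears when I try to crawl data from http://edition.cnn.com/POLITICS.

First, I loaded the page in IE and used the Developer Tools to inspect the tags. From that I picked the XPath I need for my code, "//div[@class='cd__content']". Then I loaded the same page through ScrapySharp:

 ScrapingBrowser browser = new ScrapingBrowser();
 // NavigateToPageAsync returns a Task<WebPage>, so it is awaited here
 WebPage rootPage = await browser.NavigateToPageAsync(new Uri(url));
 HtmlNodeCollection rootNodes = rootPage.Html.SelectNodes("//div[@class='cd__content']");

The result for rootNodes is null.

When I investigated further, I found that the cd__content div sits inside a "SECTION" tag, and that this "SECTION" tag is empty when the page is loaded programmatically. When I inspect the page in IE or Chrome, all the tags are filled with content, which is why I was able to pick the element there, but that is not the case when I load the page from code.

My question is: how can I load the page with all of its content filled in using ScrapySharp? Experts, please help.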

Solution
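The symptoms in the question (an empty "SECTION" tag in the fetched HTML, but filled tags in the browser's developer tools) suggest that the content is inserted by JavaScript after the initial page load. To my knowledge ScrapySharp does not execute JavaScript, so the XPath would have nothing to match. The following is a minimal sketch of how that diagnosis could be checked with HtmlAgilityPack, the parser whose HtmlNodeCollection type appears in the question. The class name EmptySectionCheck, the rawHtml string, and the id value in it are hypothetical stand-ins, not the real CNN markup; in practice rawHtml would be the page source as downloaded by the scraper.

```csharp
using System;
using HtmlAgilityPack;

class EmptySectionCheck
{
    static void Main()
    {
        // Hypothetical stand-in for the HTML the server returns
        // before any JavaScript has run in a browser.
        const string rawHtml =
            "<html><body><section id='politics'></section></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(rawHtml);

        // SelectNodes returns null, not an empty collection,
        // when no node matches the XPath expression.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//div[@class='cd__content']");
        Console.WriteLine(nodes == null
            ? "no cd__content nodes in the raw HTML"
            : nodes.Count + " cd__content node(s) found");

        // Check whether the container itself arrived without children.
        HtmlNode section = doc.DocumentNode.SelectSingleNode("//section");
        if (section != null && !section.HasChildNodes)
        {
            Console.WriteLine("section is empty: content is probably filled in by script");
        }
    }
}
```

If the container is empty in the raw HTML, no XPath change will help. The usual options are then to find the request the page's script makes for its data and fetch that directly, or to drive a real browser engine that runs the script before the HTML is read.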
