Unable to scrape web page data using ScrapySharp


Problem description

 

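Hi all,

I am facing a technical issue. I browsed several articles looking for an answer, but could not find a proper one on any website.

I am using ScrapySharp in my project to crawl web page data. The issue occurs when I try to crawl data from http://edition.cnn.com/POLITICS.

First, I loaded the page in IE and used the Developer Tools to inspect the tags. From that, the XPath I need for my code is "//div[@class='cd__content']".

I then load the same page through ScrapySharp: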

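ScrapingBrowser browser = new ScrapingBrowser();
// NavigateToPage is the synchronous call and returns a WebPage directly;
// NavigateToPageAsync returns a Task<WebPage> that would have to be awaited.
WebPage rootPage = browser.NavigateToPage(new Uri(url));
HtmlNodeCollection rootNodes = rootPage.Html.SelectNodes("//div[@class='cd__content']");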

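The result for rootNodes is null.

When I investigated further, I found that cd__content sits inside a "SECTION" tag, and that "SECTION" tag is empty when the page is loaded this way.

When I inspect the page in IE or Chrome, all the tags are filled with content, which is why I was able to pick out the element. When I load the page programmatically, they are not.

My question is: how can I load the page with all of its content populated using ScrapySharp? Experts, please help.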







Solution
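
One likely explanation, offered as an assumption rather than a confirmed diagnosis, is that the page fills its "SECTION" containers with JavaScript after the initial load. ScrapySharp parses the HTML returned by the server with HtmlAgilityPack and does not execute scripts, so content injected by scripts never appears in rootPage.Html. A minimal sketch of one possible workaround: render the page in a real browser engine through Selenium WebDriver (a swapped-in tool, not part of ScrapySharp), then hand the rendered HTML to HtmlAgilityPack. It assumes the Selenium.WebDriver, Selenium.Support, and HtmlAgilityPack NuGet packages, a local Chrome install with a matching driver, and that the cd__content class from the question is still present on the page; the 15 second timeout is arbitrary.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class RenderedPageScraper
{
    static void Main()
    {
        // URL and XPath are taken from the question.
        const string url = "http://edition.cnn.com/POLITICS";
        const string xpath = "//div[@class='cd__content']";

        // Run Chrome without a visible window.
        var options = new ChromeOptions();
        options.AddArgument("--headless");

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);

            // Wait until scripts have injected at least one matching element.
            // The timeout is an arbitrary choice; Until throws
            // WebDriverTimeoutException if nothing appears in time.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(15));
            wait.Until(d => d.FindElements(By.XPath(xpath)).Count > 0);

            // Parse the rendered DOM with HtmlAgilityPack, the same parser
            // ScrapySharp uses, so the original XPath can be reused.
            var doc = new HtmlDocument();
            doc.LoadHtml(driver.PageSource);

            HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
            if (nodes == null)
            {
                // SelectNodes returns null, not an empty collection, on no match.
                Console.WriteLine("No matching nodes found.");
                return;
            }

            foreach (HtmlNode node in nodes)
            {
                Console.WriteLine(node.InnerText.Trim());
            }
        }
    }
}
```

If a browser dependency is too heavy, another option worth checking is the network tab of the browser's developer tools: when the section data arrives through a separate request, that URL can sometimes be fetched directly without rendering the page.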



08-22 21:09