Problem description
I'm using C#, and I've been struggling for a few days to grab the final rendered HTML from a URL.
I've tried several browser engines (Awesomium, WebBrowser and so on), but none of them returns the actual rendered HTML of the page, the way Chrome shows it when I right-click and choose "Inspect element".
What I do is roughly the following (using the WebBrowser WinForms control):
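    using System.Windows.Forms;
    using mshtml; // requires a reference to the Microsoft.mshtml assembly

    public static string GetDomSource(WebBrowser wb)
    {
        // DomDocument exposes the live DOM, including changes made by scripts.
        var dd = wb.Document.DomDocument as IHTMLDocument2;
        // body.parentElement is the <html> element; outerHTML serializes the current tree.
        return dd.body.parentElement.outerHTML;
    }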
(Though I don't know whether you have already tried this, or whether you are using WinForms at all.)
To bring in the IHTMLDocument2 interface, I added a reference to the "Microsoft.mshtml" assembly.
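GetDomSource only returns the script-modified markup if it is called after the page has loaded and its scripts have run. Below is a minimal sketch of one way to drive it, not a definitive implementation. It assumes a WinForms project built as a console application, and it waits for the DocumentCompleted event plus a short grace period. The URL and the 2-second delay are placeholders chosen for illustration. The top-level check via e.Url may not match on pages that redirect.

```csharp
using System;
using System.Windows.Forms;
using mshtml; // requires a reference to the Microsoft.mshtml assembly

static class Program
{
    // The method from the answer above.
    static string GetDomSource(WebBrowser wb)
    {
        var dd = wb.Document.DomDocument as IHTMLDocument2;
        return dd.body.parentElement.outerHTML;
    }

    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        var form = new Form();
        var wb = new WebBrowser { Dock = DockStyle.Fill, ScriptErrorsSuppressed = true };
        form.Controls.Add(wb);

        // Assumed grace period so that scripts can finish modifying the DOM.
        var timer = new Timer { Interval = 2000 };
        timer.Tick += (sender, args) =>
        {
            timer.Stop();
            Console.WriteLine(GetDomSource(wb));
            form.Close();
        };

        wb.DocumentCompleted += (sender, args) =>
        {
            // The event also fires for frames; react only to the top-level document.
            if (args.Url == wb.Url) timer.Start();
        };

        // Placeholder URL; navigate once the form (and the control) has been created.
        form.Load += (sender, args) => wb.Navigate("https://example.com/");
        Application.Run(form);
    }
}
```

A fixed delay is the simplest option but is only a guess at when scripts finish. Pages that load content later, for example on scroll or on a timer, would need a longer wait or a page-specific readiness check.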