Problem description
I have a method that collects ids and XPaths from a given URL. How do I pass a username and password with the request so that I can scrape a URL that requires authentication?
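using System.Collections.Generic;
using HtmlAgilityPack;

// Field on the containing class; reused for every page load.
private readonly HtmlWeb _web = new HtmlWeb();

internal Dictionary<string, string> GetidsAndXPaths(string url)
{
    var webidsAndXPaths = new Dictionary<string, string>();
    var doc = _web.Load(url);
    // Select every element that carries an id attribute.
    var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
    if (nodes == null) return webidsAndXPaths;
    // code to get all the xpaths and ids
    return webidsAndXPaths;
}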
Should I use a web request to get the page source and then pass that file into the method above?
var wc = new WebClient();
wc.Credentials = new NetworkCredential("UserName", "Password");
wc.DownloadFile("http://somewebsite.com/page.aspx", @"C:\localfile.html");
HtmlWeb.Load has a number of overloads. These accept either an instance of NetworkCredential, or a username and password passed in directly.
Name // Description
Load(String) // Gets an HTML document from an Internet resource.
Load(String, String) // Loads an HTML document from an Internet resource.
Load(String, String, WebProxy, NetworkCredential) // Loads an HTML document from an Internet resource.
Load(String, String, Int32, String, String) // Loads an HTML document from an Internet resource.
You do not need to supply a WebProxy instance of your own; alternatively, you can pass in the system default one.
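As a rough illustration of the Load(String, String, WebProxy, NetworkCredential) overload listed above, the sketch below passes the credentials together with an empty WebProxy. The class name CredentialLoadSketch, the helper LoadWithCredentials, and its parameters are hypothetical, and using an empty WebProxy as the "no real proxy" value is an assumption that may behave differently across HtmlAgilityPack versions and target frameworks.

```csharp
using System.Net;
using HtmlAgilityPack;

internal static class CredentialLoadSketch
{
    // Hypothetical helper; the name and parameters are illustrative only.
    internal static HtmlDocument LoadWithCredentials(string url, string userName, string password)
    {
        var web = new HtmlWeb();
        var credentials = new NetworkCredential(userName, password);

        // An empty WebProxy has no address, so requests go out directly.
        var proxy = new WebProxy();

        // Second argument is the HTTP method used for the request.
        return web.Load(url, "GET", proxy, credentials);
    }
}
```

The returned HtmlDocument can then be queried with SelectNodes exactly as in the question's method.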
Alternatively, you can wire up the HtmlWeb.PreRequestHandler and set up the credentials for the request there.
htmlWeb.PreRequestHandler += (request) => {
request.Credentials = new NetworkCredential(...);
return true;
};
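Putting the handler together with the method from the question might look like the sketch below. The class name IdScraperSketch and its constructor parameters are hypothetical, and the loop that fills the dictionary is only one guess at what the elided "code to get all the xpaths and ids" does: it maps each element's id to its XPath. It also assumes a build of HtmlAgilityPack in which HtmlWeb issues requests through HttpWebRequest, so that the handler is actually invoked.

```csharp
using System.Collections.Generic;
using System.Net;
using HtmlAgilityPack;

internal class IdScraperSketch
{
    private readonly HtmlWeb _web = new HtmlWeb();

    internal IdScraperSketch(string userName, string password)
    {
        // Attach the credentials to every request this HtmlWeb instance makes.
        _web.PreRequestHandler += request =>
        {
            request.Credentials = new NetworkCredential(userName, password);
            return true; // true lets the request proceed
        };
    }

    internal Dictionary<string, string> GetidsAndXPaths(string url)
    {
        var webidsAndXPaths = new Dictionary<string, string>();
        var doc = _web.Load(url);
        var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
        if (nodes == null) return webidsAndXPaths;

        foreach (var node in nodes)
        {
            // Assumed mapping: id -> XPath; a repeated id keeps the last match.
            webidsAndXPaths[node.Id] = node.XPath;
        }
        return webidsAndXPaths;
    }
}
```

With this shape the credentials are configured once in the constructor, so GetidsAndXPaths keeps the same signature as in the question.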