Problem description
I am testing parallel execution of IWebDriver vs. WebClient (to see whether there is a performance difference and how big it is).
Before I could get to that, I ran into a problem with a simple parallel WebClient invocation.
It seems the download is never executed: I put a breakpoint in AgilityPacDocExtraction, on the specific line that calls WebClient.DownloadString(URL), but the program exits instead of stopping there so that I could step into it and see the resulting string.
The plan was to have a single method for all the actions that need to be taken, with a "mode" selector for each action, and then a simple foreach that iterates over all available enum values (the modes).
The main execution:
static void Main(string[] args)
{
    EnumForEach<Action>(Execute);
    Task.WaitAll();
}

public static void EnumForEach<Mode>(Action<Mode> Exec)
{
    foreach (Mode mode in Enum.GetValues(typeof(Mode)))
    {
        Mode Curr = mode;
        Task.Factory.StartNew(() => Exec(Curr));
    }
}
The mode / action selector:
enum Action
{
    Act1, Act2
}
The actual execution:
static BrowsresFactory.IeEngine IeNgn = new BrowsresFactory.IeEngine();
static string
    FlNm = Environment.CurrentDirectory,
    URL = "",
    TmpHtm = "";

static void Execute(Action Exc)
{
    switch (Exc)
    {
        case Action.Act1:
            break;
        case Action.Act2:
            URL = "UrlofUrChoise here...";
            FlNm += "\\TempHtm.htm";
            TmpHtm = IeNgn.AgilityPacDocExtraction(URL).GetElementbyId("Dv_Main").InnerHtml;
            File.WriteAllText(FlNm, TmpHtm);
            break;
    }
}
The class that holds the WebClient. The IWebDriver part (Selenium) is not included here, to save room in this post, and it is not relevant for now.
class BrowsresFactory
{
    public class IeEngine
    {
        private WebClient WC = new WebClient();
        private string tmpExtractedPageValue = "";
        private HtmlAgilityPack.HtmlDocument retAglPacHtmDoc = new HtmlAgilityPack.HtmlDocument();

        public HtmlAgilityPack.HtmlDocument AgilityPacDocExtraction(string URL)
        {
            WC.Encoding = Encoding.GetEncoding("UTF-8");
            tmpExtractedPageValue = WC.DownloadString(URL); // <--- tried to set a breakpoint here
            retAglPacHtmDoc.LoadHtml(tmpExtractedPageValue);
            return retAglPacHtmDoc;
        }
    }
}
The problem is that I can't see any content in the file that was supposed to be written with the value extracted via the WebClient, and in debug mode I couldn't step into the line marked with the comment in the code above. What am I doing wrong here?
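One likely cause, going only by the code shown: Task.WaitAll() is called with no arguments, so it has nothing to wait for and returns at once. Tasks started with Task.Factory.StartNew run on thread-pool (background) threads, which do not keep the process alive, so Main can return before Execute ever reaches the breakpoint. A minimal sketch of one way around this is to have EnumForEach return the tasks it starts and pass them to Task.WaitAll. The Mode enum, the Executed bag and the body of Execute below are placeholders for illustration, not the code from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class WaitAllSketch
{
    // Placeholder enum standing in for the question's Action enum.
    public enum Mode { Act1, Act2 }

    // Records which modes actually ran, so the effect of waiting is visible.
    public static readonly ConcurrentBag<Mode> Executed = new ConcurrentBag<Mode>();

    // Placeholder for the real work (download, parse, write the file).
    public static void Execute(Mode mode)
    {
        Thread.Sleep(100); // simulate a slow download
        Executed.Add(mode);
    }

    // Starts one task per enum value and returns them so the caller can wait.
    public static Task[] EnumForEach<TMode>(Action<TMode> exec)
    {
        var tasks = new List<Task>();
        foreach (TMode mode in Enum.GetValues(typeof(TMode)))
        {
            TMode current = mode; // per-iteration copy captured by the lambda
            tasks.Add(Task.Factory.StartNew(() => exec(current)));
        }
        return tasks.ToArray();
    }

    public static void Main()
    {
        Task[] tasks = EnumForEach<Mode>(Execute);
        Task.WaitAll(tasks); // blocks until every started task has finished
        Console.WriteLine("Executed " + Executed.Count + " modes");
    }
}
```

With the tasks passed in, Main stays alive until the work is done, so a breakpoint inside Execute has a chance to be hit. Task.WaitAll also rethrows any exception from the tasks as an AggregateException, which makes a failing download visible instead of silent.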
I managed to solve the issue by using WebClient, which I think requires fewer resources than WebDriver; if that is true, it should also take less time.
This is the code:
public void StartEngins()
{
    const string URL_Dollar = "URL_Dollar";
    const string URL_UpdateUsersTimeOut = "URL_UpdateUsersTimeOut";

    var urlList = new Dictionary<string, string>();
    urlList.Add(URL_Dollar, "http://bing.com");
    urlList.Add(URL_UpdateUsersTimeOut, "http://localhost:..../.......aspx");

    var htmlDictionary = new ConcurrentDictionary<string, string>();
    Parallel.ForEach(
        urlList.Values,
        new ParallelOptions { MaxDegreeOfParallelism = 20 },
        url => Download(url, htmlDictionary)
    );

    foreach (var pair in htmlDictionary)
    {
        // Process(pair);
        MessageBox.Show(pair.Value);
    }
}
public class SmartWebClient : WebClient
{
    private readonly int maxConcurentConnectionCount;

    public SmartWebClient(int maxConcurentConnectionCount = 20)
    {
        this.maxConcurentConnectionCount = maxConcurentConnectionCount;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        // Only HTTP(S) requests have a per-host connection limit to raise;
        // "as" avoids an InvalidCastException for other schemes.
        var httpWebRequest = request as HttpWebRequest;
        if (httpWebRequest != null && maxConcurentConnectionCount != 0)
        {
            httpWebRequest.ServicePoint.ConnectionLimit = maxConcurentConnectionCount;
        }
        return request;
    }
}
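The Download method called from Parallel.ForEach is not shown above. A minimal sketch of what it might look like follows, assuming it creates one client per call (a single WebClient instance does not support concurrent I/O operations) and stores each page under its URL. The method body, the choice of the URL as the dictionary key, and the error handling are all assumptions for illustration, not the author's code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Text;

public static class DownloadSketch
{
    // Hypothetical helper: downloads one URL and stores the HTML under that URL.
    public static void Download(string url, ConcurrentDictionary<string, string> htmlDictionary)
    {
        // One client per call, because a WebClient cannot run concurrent requests.
        // The SmartWebClient from the answer could be used here instead.
        using (var client = new WebClient())
        {
            client.Encoding = Encoding.UTF8;
            try
            {
                string html = client.DownloadString(url);
                htmlDictionary.TryAdd(url, html);
            }
            catch (WebException ex)
            {
                // Keep going if one URL fails; record the error text instead.
                htmlDictionary.TryAdd(url, "ERROR: " + ex.Message);
            }
        }
    }
}
```

Keying the dictionary by URL keeps each result traceable to its source even though Parallel.ForEach gives no ordering guarantee.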