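I am trying to get the status codes of some pages.
The problem is that the default GetAsync method returns the entire page content, while I only need the headers to check the page's status (404, 403, etc.). Because I have to check a large number of URIs, this ends up consuming a lot of memory.
I added the ResponseHeadersRead option to fix the memory usage, but then the code started throwing "A task was canceled" exceptions, which means the requests timed out.
What I know so far:
The ResponseHeadersRead code works only while Fiddler (an HTTP/HTTPS debugger) is running on my local PC.
The ResponseHeadersRead code works in online coding environments such as dotnetfiddle, but not in a Windows OS environment.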
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Net;
using System.Security.Cryptography;

public class Program
{
    public static string[] Tags = { "first", "second" };
    public static string prefix = null;
    static HttpClient Client = new HttpClient();

    public static void Main()
    {
        System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
        Client.DefaultRequestHeaders.ConnectionClose = true;
        // limit parallel thread
        Parallel.ForEach(Tags,
            new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) },
            tag =>
            {
                for (int i = 1; i < 4; i++)
                {
                    switch (i)
                    {
                        case 1:
                            prefix = "1";
                            break;
                        case 2:
                            prefix = "2";
                            break;
                        case 3:
                            prefix = "3";
                            break;
                    }
                    Console.WriteLine(tag.ToString() + " and " + i);
                    HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix).Result; // this works
                    // HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix,HttpCompletionOption.ResponseHeadersRead).Result; // this fails from 2nd try with one url.
                    Console.WriteLine(i + " and " + (int)response.StatusCode);
                    if (response.StatusCode != HttpStatusCode.NotFound)
                    {
                    }
                }
            });
    }
}
The timeout happens only when ResponseHeadersRead is used; without it, the requests complete.
Best answer
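Don't use Parallel for async code; it is meant for CPU-bound work. You can run all the requests concurrently without wasting threads that just block while waiting. Increasing DefaultConnectionLimit is not the right way to solve this problem, although it would make the symptom go away in this case. The correct way to handle ResponseHeadersRead is to Dispose the response, that is: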
using(HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead).Result) {}
or to read the response's Content:
var data = response.Content.ReadAsStringAsync().Result;
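With ResponseHeadersRead you need to do one of these so that the connection is released; otherwise each unfinished response keeps a connection occupied and later requests wait until they time out. I encourage you to rewrite this code to get rid of Parallel and to stop calling .Result on async calls. You could do something like this: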
private static async Task Go()
{
    System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
    Client.DefaultRequestHeaders.ConnectionClose = true;
    var tasks = Tags.Select(tag =>
    {
        var requests = new List<Task>();
        for (int i = 1; i < 4; i++)
        {
            switch (i)
            {
                case 1:
                    prefix = "1";
                    break;
                case 2:
                    prefix = "2";
                    break;
                case 3:
                    prefix = "3";
                    break;
            }
            requests.Add(MakeRequest(Client, prefix, tag));
        }
        return requests;
    }).SelectMany(t => t);
    await Task.WhenAll(tasks);
}

private async static Task MakeRequest(HttpClient client, string prefix, string tag)
{
    // Disposing the response releases the connection once the headers have been read.
    using (var response = await client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead))
    {
        Console.WriteLine(tag + " and " + prefix);
        Console.WriteLine(prefix + " and " + (int)response.StatusCode);
    }
}
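Since the question mentions a large number of URIs, it may help to cap how many requests are in flight at once while still disposing every response. The following is only a sketch under assumptions: the StatusChecker class, the CheckAllAsync method and the maxConcurrency parameter are hypothetical names that do not appear in the original code, and throttling with SemaphoreSlim is one possible choice rather than part of the answer above. Request failures (for example HttpRequestException or a timeout) are not handled here and would propagate from the returned task.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class StatusChecker
{
    // Hypothetical helper (not from the original answer): reads only the
    // response headers of each URL and returns the numeric status code per URL.
    // maxConcurrency is assumed to be at least 1.
    public static async Task<Dictionary<string, int>> CheckAllAsync(
        HttpClient client, IEnumerable<string> urls, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // Distinct() avoids duplicate keys when the dictionary is built.
            var tasks = urls.Distinct().Select(async url =>
            {
                await gate.WaitAsync(); // wait for a free slot
                try
                {
                    // Disposing the response releases the connection even
                    // though the body was never read.
                    using (var response = await client.GetAsync(
                        url, HttpCompletionOption.ResponseHeadersRead))
                    {
                        return new KeyValuePair<string, int>(url, (int)response.StatusCode);
                    }
                }
                finally
                {
                    gate.Release(); // free the slot for the next request
                }
            }).ToList(); // ToList() starts all tasks; the gate limits how many run at once

            var results = await Task.WhenAll(tasks);
            return results.ToDictionary(r => r.Key, r => r.Value);
        }
    }
}
```

From an async method it could be called as var codes = await StatusChecker.CheckAllAsync(Client, urls, 8); where urls is whatever list of addresses you need to check and 8 is an arbitrary limit chosen for illustration.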