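I am trying to get the status codes of some pages.

The problem is that the default GetAsync method returns the entire page content, while I only need the headers to check the status of the page (404, 403, etc.). Because I have to check a large number of URIs, this ends up using a lot of memory.

I added the ResponseHeadersRead option to fix the memory usage, but then the code started throwing "A task was canceled" exceptions, which means a timeout.

Things I know:


The ResponseHeadersRead code works only when I run Fiddler (HTTP/HTTPS debugger) on my local PC.
The ResponseHeadersRead code works in online coding environments (e.g. dotnetfiddle), but not in a Windows OS environment.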


using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Net;
using System.Security.Cryptography;


public class Program
{
    public static string[] Tags = { "first", "second" };
    public static string prefix = null;
    static HttpClient Client = new HttpClient();
    public static void Main()
    {
        System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
        Client.DefaultRequestHeaders.ConnectionClose = true;

        // limit parallel thread
        Parallel.ForEach(Tags,
        new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) },
        tag =>
        {
            for (int i = 1; i < 4; i++)
            {
                switch (i)
                {
                    case 1:
                        prefix = "1";
                        break;
                    case 2:
                        prefix = "2";
                        break;
                    case 3:
                        prefix = "3";
                        break;
                }
                Console.WriteLine(tag.ToString() + " and " + i);
                HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix).Result; // this works
//                HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix,HttpCompletionOption.ResponseHeadersRead).Result; // this fails from 2nd try with one url.
                Console.WriteLine(i + " and " + (int)response.StatusCode);
                if (response.StatusCode != HttpStatusCode.NotFound)
                {

                }

            }
        });

    }
}


The requests time out when ResponseHeadersRead is used, but not without it.


Best answer

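Don't use Parallel for async code; it is meant for CPU-bound work. You can run all the requests concurrently without wasting threads that just block while waiting on them. Raising DefaultConnectionLimit is not the right way to solve this, although it would make the problem go away in this case. The right way to handle ResponseHeadersRead is to Dispose the response, i.e.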

using(HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead).Result) {}


or to read the Content of the response:

var data = response.Content.ReadAsStringAsync().Result;


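With ResponseHeadersRead you have to do one of these so that the connection is closed. I encourage you to rewrite this code to get rid of Parallel and to await the async calls instead of calling .Result on them.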

You could do something like this:

private static async Task Go()
{
    System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
    Client.DefaultRequestHeaders.ConnectionClose = true;

    var tasks = Tags.Select(tag =>
    {
        var requests = new List<Task>();
        for (int i = 1; i < 4; i++)
        {
            switch (i)
            {
                case 1:
                    prefix = "1";
                    break;
                case 2:
                    prefix = "2";
                    break;
                case 3:
                    prefix = "3";
                    break;
            }

            requests.Add(MakeRequest(Client, prefix, tag));
        }
        return requests;
    }).SelectMany(t => t);

    await Task.WhenAll(tasks);
}

private async static Task MakeRequest(HttpClient client, string prefix, string tag)
{

    using (var response = await client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead))
    {
        Console.WriteLine(tag + " and " + prefix);
        Console.WriteLine(prefix + " and " + (int)response.StatusCode);
    }
}
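
Since the question mentions a large number of URIs, starting every request at once may not be desirable. The following is a minimal sketch, not part of the original answer, of one way to cap the number of requests in flight with SemaphoreSlim while still disposing each response. The names StatusChecker, GetStatusCodesAsync, maxConcurrency and StubHandler are hypothetical and introduced only for illustration; StubHandler is there so the sketch can be tried without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class StatusChecker
{
    // Hypothetical helper: returns the numeric status code for each distinct URL,
    // allowing at most maxConcurrency requests to be in flight at any time.
    public static async Task<Dictionary<string, int>> GetStatusCodesAsync(
        HttpClient client, IEnumerable<string> urls, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = urls.Distinct().Select(async url =>
            {
                await gate.WaitAsync();
                try
                {
                    // Only the headers are read; disposing the response releases the connection.
                    using (var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
                    {
                        return new KeyValuePair<string, int>(url, (int)response.StatusCode);
                    }
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            // WhenAll completes only after every task has finished, so the semaphore
            // is not disposed while a request is still using it.
            var results = await Task.WhenAll(tasks);
            return results.ToDictionary(r => r.Key, r => r.Value);
        }
    }
}

// Hypothetical stub handler for trying the sketch offline:
// it answers 404 for the path "/2" and 200 for everything else.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var code = request.RequestUri.AbsolutePath == "/2"
            ? HttpStatusCode.NotFound
            : HttpStatusCode.OK;
        return Task.FromResult(new HttpResponseMessage(code) { RequestMessage = request });
    }
}
```

To try it against real pages, pass the shared Client from the question instead of new HttpClient(new StubHandler()). A request that fails or times out will surface as an exception from the awaited call, so a caller that wants to continue past failures would need to catch it per URL.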
