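Background

I am trying to use S3 as an "infinitely" large caching layer for some "fairly" static XML documents. I want to ensure that the client application (which will run on thousands of machines concurrently and request the XML documents several times an hour) downloads these XML documents only if their content has changed since the client last downloaded them.

Approach

On Amazon S3 we can use the HTTP ETag for this. By default, the ETag of an Amazon S3 object is set to the MD5 hash of the object.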
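
To make the ETag comparison concrete, here is a minimal sketch of how a client could compute the MD5 hex digest of a document it already holds and compare it with an ETag. The class and method names (ETagHelper, ComputeMd5Hex, MatchesETag) are hypothetical and not part of the AWS SDK. It assumes the object was uploaded with a plain single-part PUT; objects uploaded via multipart upload have ETags that are not a simple MD5 of the content, so this check would not apply to them.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper, not part of the AWS SDK.
public static class ETagHelper
{
    // Returns the lowercase hex MD5 digest of the given bytes.
    public static string ComputeMd5Hex(byte[] content)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(content);
            // BitConverter yields "D4-1D-8C-..."; strip the dashes and lowercase it.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // ETags arrive wrapped in double quotes on the wire; strip them before comparing.
    public static bool MatchesETag(byte[] content, string etag)
    {
        return string.Equals(
            ComputeMd5Hex(content),
            etag.Trim('"'),
            StringComparison.OrdinalIgnoreCase);
    }
}
```

A client that caches the ETag returned with its last download does not need to recompute the hash; the helper is mainly useful when only the document itself was kept.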

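We can then specify the MD5 hash of the XML document in the GetObjectRequest.ETagToNotMatch property. This ensures that when we make the AmazonS3.GetObject call (or, in my case, the asynchronous version, AmazonS3.BeginGetObject and AmazonS3.EndGetObject), if the requested document has the same MD5 hash as the one in GetObjectRequest.ETagToNotMatch, S3 automatically returns HTTP status code 304 (Not Modified) and the actual content of the XML document is not downloaded.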
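
For reference, the HTTP-level behavior described above can be sketched without the SDK. This is an illustration only, under stated assumptions: it uses System.Net.Http.HttpClient (which needs .NET 4.5 or later, so it is not what the SDK version in this question uses), the object URL is assumed to be publicly readable or pre-signed, and the names ConditionalGetSketch and DownloadIfChangedAsync are hypothetical.

```csharp
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical sketch of a conditional GET; not the AWS SDK's implementation.
public static class ConditionalGetSketch
{
    private static readonly HttpClient Http = new HttpClient();

    // Returns the document body if it changed, or null if the server answered 304.
    public static async Task<string> DownloadIfChangedAsync(string url, string knownETag)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, url))
        {
            if (!string.IsNullOrEmpty(knownETag))
            {
                // Entity tags must be quoted strings in the If-None-Match header.
                string quoted = "\"" + knownETag.Trim('"') + "\"";
                request.Headers.IfNoneMatch.Add(new EntityTagHeaderValue(quoted));
            }

            using (HttpResponseMessage response = await Http.SendAsync(request))
            {
                // 304 is an expected outcome here, not a failure.
                if (response.StatusCode == HttpStatusCode.NotModified)
                {
                    return null;
                }

                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync();
            }
        }
    }
}
```

The point of the sketch is that a single request covers both cases: an unchanged document costs one small 304 response, and a changed one returns the new body in the same round trip.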

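Problem

However, the problem is that when calling AmazonS3.GetObject (or its asynchronous equivalent), the Amazon .NET API actually treats HTTP status code 304 (Not Modified) as an error: it retries the GET request three times and then finally throws Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3

Obviously, I could change this implementation to call AmazonS3.GetObjectMetaData, compare the ETag, and then call AmazonS3.GetObject if they do not match, but then there are two requests to S3 instead of one whenever the file is stale. I would prefer a single request, whether or not the XML document needs to be downloaded.

Any ideas? Is this a bug, or am I missing something? Is there some way to reduce the number of retries to one and "handle" the exception (although I am "unhappy" with that approach)?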
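
The "handle the exception" workaround mentioned above could look roughly like the sketch below. It is an assumption-laden illustration, not a confirmed fix: it requires the AWS SDK for .NET assembly, it assumes the AmazonS3Exception raised after the retries carries StatusCode == HttpStatusCode.NotModified (not verified for version 1.3.14), and the names NotModifiedWorkaround and TreatNotModifiedAsNull are hypothetical.

```csharp
using System.Net;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

// Hypothetical wrapper; assumes the SDK exception exposes the 304 status code.
public static class NotModifiedWorkaround
{
    // Maps a download that faulted because of HTTP 304 to a null result,
    // and lets every other failure propagate to the caller.
    public static Task<GetObjectResponse> TreatNotModifiedAsNull(Task<GetObjectResponse> download)
    {
        return download.ContinueWith(t =>
        {
            if (!t.IsFaulted)
            {
                // Succeeded: pass the response through. (A cancelled task
                // throws here, which faults the continuation.)
                return t.Result;
            }

            var s3Error = t.Exception.Flatten().InnerException as AmazonS3Exception;
            if (s3Error != null && s3Error.StatusCode == HttpStatusCode.NotModified)
            {
                // Assumption: this status means "cached copy is still current".
                return (GetObjectResponse)null;
            }

            throw t.Exception;
        });
    }
}
```

Note that this only hides the exception; the SDK would still perform its retries before the task faults, so each "not modified" check stays slow. If the client configuration exposes a retry limit (AmazonS3Config.MaxErrorRetry in the SDK versions I am aware of), lowering it might shorten that delay, at the cost of fewer retries for genuine transient errors.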

Implementation

I am using the AWS SDK for .NET (version 1.3.14).

Here is my implementation (trimmed slightly to keep it short):

public Task<GetObjectResponse> DownloadString(string key, string etag = null) {

    var request = new GetObjectRequest { Key = key, BucketName = Bucket };

    if (etag != null) {
        request.ETagToNotMatch = etag;
    }

    var task = Task<GetObjectResponse>.Factory.FromAsync(_s3Client.BeginGetObject, _s3Client.EndGetObject, request, null);

    return task;
}

I then call it like this:
var dlTask          = s3Manager.DownloadString("new one", "d7db7bc318d6eb9222d728747879b52e");
var responseTasks   = new[]
    {
        dlTask.ContinueWith(x => _log.Error("Error downloading string.", x.Exception), TaskContinuationOptions.OnlyOnFaulted),
        dlTask.ContinueWith(x => _log.Warn("Downloading string was cancelled."), TaskContinuationOptions.OnlyOnCanceled),
        dlTask.ContinueWith(x => _log.Info(string.Format("Done with download: {0}", x.Result.ETag)), TaskContinuationOptions.OnlyOnRanToCompletion)
    };

try {
    Task.WaitAny(responseTasks);
} catch (AggregateException aex) {
    _log.Error("Error while processing download string.", aex);
}

_log.Info("Exiting...");

This produces the following log output:
2011-10-11 13:21:20,376 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.6140812.
2011-10-11 13:21:20,385 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:20,789 [11] INFO  Amazon.S3.AmazonS3Client - Retry number 1 for request GetObject.
2011-10-11 13:21:22,329 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.1400356.
2011-10-11 13:21:22,329 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:23,929 [11] INFO  Amazon.S3.AmazonS3Client - Retry number 2 for request GetObject.
2011-10-11 13:21:26,508 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:00.9790314.
2011-10-11 13:21:26,508 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:32,908 [11] INFO  Amazon.S3.AmazonS3Client - Retry number 3 for request GetObject.
2011-10-11 13:21:40,604 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.2950718.
2011-10-11 13:21:40,605 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:40,621 [11] ERROR Amazon.S3.AmazonS3Client - Error for GetResponse
Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
   at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
   at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
   at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
2011-10-11 13:21:40,635 [10] INFO  Example.Program - Exiting...
2011-10-11 13:21:40,638 [19] ERROR Example.Program - Error downloading string.
System.AggregateException: One or more errors occurred. ---> Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
   at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
   at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
   at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)
   --- End of inner exception stack trace ---
---> (Inner Exception #0) Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
   at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
   at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
   at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)<---

Best answer

I also posted this question on the Amazon developer forums and got a response from an official AWS employee:



I have added my thoughts to that thread; if anyone else is interested in joining the discussion, this is the thread:

AmazonS3.GetObject sees HTTP 304 (NotModified) as an error. Way to allow it?

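Note: Once Amazon has resolved this issue, I will update my answer to reflect the outcome.

Update (2012-01-24): Still waiting for further information from Amazon.

Update (2018-12-06): This issue was fixed in 2013 in AWS SDK 1.5.20: https://forums.aws.amazon.com/thread.jspa?threadID=77995&tstart=0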

Source: the Stack Overflow question "c# - How can I specify that HTTP status code 304 (NotModified) is not an error condition inside the Amazon S3 GetObject API?": https://stackoverflow.com/questions/7726600/
