Question

I’ve been reading a book and I have a particular question about the ETag chapter. The author says that ETags might harm performance and that you must tune them finely or disable them completely.

I already know what ETags are and understand the risks, but is it that hard to get ETags right?

I’ve just made an application that sends an ETag whose value is the MD5 hash of the response body. This is a simple solution, easy to achieve in many languages.
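
A minimal sketch of the approach described here, assuming the whole response body is available in memory as bytes. The names `make_etag` and `respond` are hypothetical and not tied to any framework; the function only illustrates how a client's `If-None-Match` header could be compared against the hash.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build an ETag from the MD5 hex digest of the body (ETag values are quoted strings)."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET, honouring If-None-Match.

    Hypothetical helper: a real server would also handle other headers.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # The header may carry several comma-separated tags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so drop any "W/" prefix first.
        candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in candidates:
            # The client's copy is current: send no body.
            return 304, headers, b""
    return 200, headers, body
```

Note that the body still has to be generated and hashed before the 304 can be sent, so this saves bandwidth but not server-side work.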


  • Is using MD5 hash of the response body as ETag wrong? If so, why?

Why does the author (who obviously outsmarts me by many orders of magnitude) not propose such a simple solution?

This last question is hard to answer unless you are the author :), so I’m trying to find the weak points of using an MD5 hash as an ETag.

Recommended answer

ETag is similar to the Last-Modified header. It's a mechanism by which the client determines whether a resource has changed.

Arguably, an ETag that just happens to be the Last-Modified date (i.e. the same text) meets all the criteria necessary for an ETag. It simply needs to be a unique value representing the state of a resource. It need not be unique across the entire domain of resources, only within that one resource.

Now, technically, an ETag has "infinite" resolution compared to a Last-Modified header. Last-Modified only changes at a granularity of 1 second, whereas an ETag can be sub-second.
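
To illustrate the granularity point, here is a small sketch. The two timestamps are made up, and `mtime_etag` is a hypothetical helper that derives a tag from a nanosecond modification time plus the size. Two changes inside the same second produce the same Last-Modified text but different ETags.

```python
from email.utils import formatdate


def last_modified(mtime_seconds: float) -> str:
    """Format a POSIX timestamp as an HTTP date, which has whole-second resolution."""
    return formatdate(mtime_seconds, usegmt=True)


def mtime_etag(mtime_ns: int, size: int) -> str:
    """Hypothetical cheap ETag: nanosecond mtime and byte size, in hex, quoted."""
    return '"%x-%x"' % (mtime_ns, size)


# Two invented modification times, 0.8 seconds apart within the same second.
first_ns = 1_700_000_000_100_000_000
second_ns = 1_700_000_000_900_000_000

# The Last-Modified values are identical, so a client cannot tell them apart.
same_header = last_modified(first_ns / 1e9) == last_modified(second_ns / 1e9)

# The ETags differ, so the second change is visible to the client.
different_tags = mtime_etag(first_ns, 2048) != mtime_etag(second_ns, 2048)
```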

You can implement both ETag and Last-Modified, or simply one or the other (or neither, of course). If your Last-Modified is not sufficient, then consider an ETag.

Mind, I would not set ETag for "every" resource. Basically, I wouldn't set it for anything that has no expectation of being cached (dynamic content notably). There's no point in that case, just wasted work.

Edit: I see your edit, so let me clarify.

MD5 is fine. The only downside is calculating MD5 all the time. Running MD5 on, say, a 200K PDF file, is expensive. Running MD5 on a resource that has no expectation of being cached is simply wasteful (i.e. dynamic content).

The trick is simply that whatever mechanism you use, it should be as cheap as Last-Modified typically is. Last-Modified is, again, typically, a property of the resource, and usually very cheap to access.

ETags should be similarly cheap. If you are using MD5, and you can cache/store the association between the resource and the MD5 hash, then that's a fine solution. However, recalculating the MD5 each time the ETag is needed basically runs counter to the idea of using ETags to improve overall server performance.

10-28 22:32