Some time ago I asked how to do Incremental updates using browser cache. Here I'm giving a short summary of the problem - for more context, especially the reason why I want to do this, please refer to the old question.
I'd like you to review and improve my solution idea (just an idea, so don't send me to code review :D).
The problem
The client (a single page app) gets rather big lists from the server. This works fine and actually saves server resources, as
- the same list can be served to multiple clients
- and the clients do the filtering and sorting without bothering the server again and again.
Some of these lists are user-specific, others are common to a group of users, others are global. All these lists may change at any time, and we never want to serve stale data (the Cache-Control and Expires HTTP headers are of no direct use here).
We're using 304 NOT MODIFIED, which helps in the case when nothing has changed. When anything changes, the changes are usually small, but HTTP does not support this case at all, so we have to send the whole list including the unchanged parts. We could send just the delta instead, but there's no obvious way for it to be cached efficiently by the browser (caching in localStorage or alike is by far not as good, as I explained in my linked question).
An important property of our lists is that every item has a unique id and a last-modified timestamp. The timestamp allows us to compute the delta easily by finding the items that have changed recently. The id allows us to apply the delta simply by replacing the corresponding items (the list is internally a Map<Id, Item>). This wouldn't work for deletions, but let's ignore them for now.
The idea
I'm suggesting to use multiple lists (any number should work) of varying sizes, with the bigger lists cacheable for a long time. Let's assume a day is a suitable time unit and let's use the following three lists:
- WEEK: the base list, containing all items as they existed at some arbitrary time in the current week.
- DAY: a list containing all items which have changed this week except today, as they existed at some arbitrary time in the current day. Items changed today may or may not be included.
- CURRENT: a list containing all items which have changed today, as they exist just now.
The client gets all three lists. It starts with WEEK, applies DAY (i.e., inserts new items and replaces old ones) and finally applies CURRENT.
An example
Let's assume there are 1000 items in the list with 10 items changing per day.
The WEEK list contains all 1000 items, but it can be cached until the end of the week. Its exact content is not specified and different clients may have different versions of it (as long as the condition from the above bullet holds). This allows the server to cache the data for a whole week, but it also allows it to drop them, as serving the current state is fine, too.
The DAY list contains up to 70 items and can be cached until the end of the day.
The CURRENT list contains up to 10 items and can only be cached until anything changes.
The communication
The client should know nothing about the used time scale, but it needs to know the number of lists to ask for. A "classical" request like
GET /api/order/123 // get the whole list with up to date content
will be replaced by three requests like
GET /api/0,order/123 // get the WEEK list
GET /api/1,order/123 // get the DAY list
GET /api/2,order/123 // get the CURRENT list
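As a client-side sketch: the fold below applies the responses strictly in WEEK, DAY, CURRENT order, which is what makes overlapping items end up at their newest version. The JSON shape (an array of items with `id` fields) is an assumption for illustration:

```javascript
// Fold the WEEK, DAY and CURRENT responses (in that order) into one Map;
// later lists overwrite earlier versions of the same item.
function foldLists(lists) {
  const state = new Map();
  for (const items of lists) {
    for (const item of items) state.set(item.id, item);
  }
  return state;
}

// In the real client the three arrays would come from the three requests,
// e.g. [0, 1, 2].map(i => fetch(`/api/${i},order/123`).then(r => r.json())).
// Here inline data stands in for those responses.
const week    = [{ id: 1, v: "week" }, { id: 2, v: "week" }];
const day     = [{ id: 2, v: "day" }];
const current = [{ id: 2, v: "now" }, { id: 3, v: "now" }];
const merged  = foldLists([week, day, current]);
console.log(merged.size);     // 3
console.log(merged.get(2).v); // "now"
```

Note that the three requests may run in parallel; only the order of application matters.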
The questions
Usually the changes are indeed as described, but sometimes all items change at once. When this happens, all three lists contain all items, meaning that we have to serve three times as much data. Fortunately, such events are very rare (e.g., when we add an attribute), but I'd like to see a way allowing us to avoid such bursts.
Do you see any other problems with this idea?
Is there any solution for deletions apart from just marking the items as deleted and postponing the physical deletion until the caches expire (i.e., until the end of the week in my example)?
Any improvements?
I assume you understand the following general problems with your approach:
- Compared to the "one big list + 304" approach, this reduces network traffic, but increases client processing time: your client code still sees the same responses on a warm cache as on a cold cache, but now there are three of them, with overlapping data.
- Compared to the localStorage approach, this falls a bit on the "clever" side, with implications for long-term maintainability. Clear docs and a test suite are a must.
Assuming this, I like your approach.
There’s one thing I might change. It adds a bit of flexibility, but also a bit of complexity. It may or may not be a good idea.
Instead of hardcoding three URLs on the client, you could send explicit hyperlinks in response headers. Here’s how it might work:
The client requests a hardcoded "entry point":
> GET /api/order/123?delta=all
< 200 OK
< Cache-Control: max-age=604800
< Delta-Location: /api/order/123?delta=604800
<
< [...your WEEK list...]
Seeing the Delta-Location header, the client then requests it and applies the resulting delta:
> GET /api/order/123?delta=604800
< 200 OK
< Cache-Control: max-age=86400
< Delta-Location: /api/order/123?delta=86400
<
< [...your DAY list...]
And so on, until the response has no Delta-Location.
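A sketch of that client loop, assuming the made-up `Delta-Location` header and the item shape from above (the cycle guard anticipates the loop problem mentioned further down):

```javascript
// Follow the Delta-Location chain: start at the entry point, apply each
// response to the state, and keep requesting the linked delta until a
// response carries no Delta-Location header. The `seen` set guards
// against a buggy server linking back to an earlier delta.
async function fetchWithDeltas(entryUrl) {
  const state = new Map();
  const seen = new Set();
  let url = entryUrl;
  while (url && !seen.has(url)) {
    seen.add(url);
    const res = await fetch(url);
    for (const item of await res.json()) state.set(item.id, item);
    const next = res.headers.get("Delta-Location");
    // Resolve relative to the request URL, like the standard Location header.
    url = next ? new URL(next, res.url).href : null;
  }
  return state;
}
```

On a warm cache the intermediate fetches are answered from the browser cache, so only the tail of the chain actually hits the network.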
This allows the server to change the delta structure unilaterally at any time. (Of course, it still has to support the old structure for as long as it may be cached on the clients.)
In particular, this lets you solve the bursts problem. After performing a mass change, you could start serving much smaller deltas (with correspondingly smaller max-age values), such that they exclude the mass change. Then you would gradually increase the delta sizes as time goes by. This would involve extra logic/configuration on the server side, but I'm sure you can figure it out if the bursts are a real concern for you.
Ideally you would resolve Delta-Location against the request URL, so it behaves like the standard Location and Content-Location headers, for uniformity and flexibility. One way to do that in JavaScript is the URL object.
Further things you could tweak in this hyperlinks approach:
- You should probably make max-age slightly smaller than the delta, to account for network delays.
- You might need extra logic on the client to avoid an endless loop if the server (erroneously) links back to a previous delta.
- You could use the standard Link header instead of a non-standard Delta-Location. But you'd still need a non-standard relation type, so it's not clear what this would buy you.