问题描述
如果我有一个要固定大小为 N 的数组,以便缓存最新的 N 个项目,则一次限制 N 已达到,我将必须摆脱最旧的项目,同时添加最新的项目。
If I have an array that I want to be of fixed size N for the purpose of caching the most recent of N items, then once limit N is reached, I'll have to get rid of the oldest item while adding the newest item.
注意:我不会不必关心最新的项目是在数组的开头还是结尾,只要这些项目按照添加的顺序被删除即可。
Note: I don't care if the newest item is at the beginning or end of the array, just as long as the items get removed in the order that they are added.
显而易见方式是:
push()
和 shift ()
(这样 cache [0]
包含最旧的项目),或 unshift()
和 pop()
(这样 cache [0]
包含最新项) push()
and shift()
(so that cache[0]
contains the oldest item), orunshift()
and pop()
(so that cache[0]
contains the newest item)基本思路:
var cache = [], limit = 10000;
function cacheItem( item ) {
// In case we want to do anything with the oldest item
// before it's gone forever.
var oldest = [];
cache.push( item );
// Use WHILE and >= instead of just IF in case the cache
// was altered by more than one item at some point.
while ( cache.length >= limit ) {
oldest.push( cache.shift() );
}
return oldest;
}
但是,我读过有关的内存问题shift
和 unshift
,因为它们更改了数组的开头并移动了其他所有内容,但不幸的是,其中的 方法必须采用这种方式!
However, I've read about memory issues with shift
and unshift
since they alter the beginning of the array and move everything else around, but unfortunately, one of those methods has to be used to do it this way!
问:
还有其他方法可以达到更好的性能吗?
如果我已经提到的两种方法是最好的,是否有特定的优点/缺点?我需要知道吗?
结论
在对数据结构进行了更多研究之后(我从未使用过其他语言进行编程,因此,如果它不是Javascript的本机,我可能还没有听说过!),然后进行一堆在具有大小阵列的浏览器中的基准测试,无论是大大小小的读取/写入,都是我发现的:
After doing some more research into data structures (I've never programmed in other languages, so if it's not native to Javascript, I likely haven't heard of it!) and doing a bunch of benchmarking in multiple browsers with both small and large arrays as well as small and large numbers of reads / writes, here's what I found:
贝吉(Bergi)提出的循环缓冲区方法是目前为止最好的性能(由于答案和评论中所述的原因),因此已被接受为答案。但是,它不是那么直观,并且很难编写自己的额外功能(因为您始终必须考虑
偏移量
)。如果您要使用此方法,建议您使用已创建的方法,例如实际上是一个非常好的选择。我最初不理会它,因为它似乎不需要很多额外的东西,但令我惊讶的是。... - The 'circular buffer' method proposed by Bergi is hands-down THE best as far performance (for reasons explained in the answer and comments), and hence it has been accepted as the answer. However, it's not as intuitive, and makes it difficult to write your own 'extra' functions (since you always have to take
offset
into account). If you're going to use this method, I recommend an already-created one like this circular buffer on GitHub.The 'pop/unpush' method is much more intuitive, and performs fairly well, accept at the most extreme numbers.The 'copyWithin' method is, sadly, terrible for performance (tested in multiple browsers), quickly creating unacceptable latency. It also has no IE support. It's such a simple method! I wish it worked better.The 'linked list' method, proposed in the comments by Felix Kling, is actually a really good option. I initially disregarded it because it seemed like a lot of extra stuff I didn't need, but to my surprise....我真正需要的是一个(使用双向链接列表)。现在,由于我没有在原始问题中指定其他要求,因此我仍将Bergi的答案标记为对该特定问题的最佳答案。但是,由于我需要知道缓存中是否已经存在一个值,并且如果存在,请将其标记为缓存中的最新项,因此必须向循环缓冲区的 add()中添加其他逻辑。
方法(主要是 indexOf()
)使它的效率不比'pop / unpush'方法高。但是,在这种情况下,LRUMap的性能使其他两个功能都无法自拔!
What I actually needed was a Least Recently Used (LRU) Map (which employs a doubly-linked list). Now, since I didn't specify my additional requirements in my original question, I'm still marking Bergi's answer as the best answer to that specific question. However, since I needed to know if a value already existed in my cache, and if so, mark it as the newest item in the cache, the additional logic I had to add to my circular buffer's add()
method (primarily indexOf()
) made it not much more efficient than the 'pop/unpush' method. HOWEVER, the performance of the LRUMap in these situations blew both of the other two out of the water!
所以总结一下:
链接列表-大多数选项,同时仍保持出色的性能
循环缓冲区-仅添加和获取的最佳性能
Pop / Unpush-最直观,最简单
copyWithin
-当前性能糟糕,没有理由使用 - Linked List -- most options while still maintaining great performanceCircular Buffer -- best performance for just adding and gettingPop / Unpush -- most intuitive and simplest
copyWithin
-- terrible performance currently, no reason to use
推荐答案
不想复制数组中的内容,每次都会花费 O(n)
步。
You are not looking to copy stuff around within the array, which would take O(n)
steps every time.
相反,这是的理想用例。只需保持列表的开始和结束的偏移量,然后使用该偏移量并对其长度取模就可以访问缓冲区。
Instead, this is the perfect use case for a ring buffer. Just keep an offset to the "start" and "end" of the list, then access your buffer with that offset and modulo its length.
var cache = new Array(10000);
cache.offset = 0;
function cacheItem(item) {
cache[cache.offset++] = item;
cache.offset %= cache.length;
}
function cacheGet(i) { // backwards, 0 is most recent
return cache[(cache.offset - 1 - i + cache.length) % cache.length];
}
这篇关于Javascript:有效地将项目移入和移出固定大小的数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!