问题描述
我有数据库项目,除了它们的主键外,还需要一个对项目所属组唯一的索引.我们将属性称为 nbr
,将项目组合在一起并定义唯一 nbr
的范围的属性,我们将称为 group
.此 nbr
必须在 [1-N] 范围内,并且 可以 在从外部源导入项目时设置.由于所有项目都必须有一个 nbr
,因此任务就变成了如何跟踪使用了哪些值,以便为手动添加的新项目选择一个免费的 nbr
.
I have database items that, in addition to their primary key, need an index unique to the group in which the items belong. Let's call the property nbr
, and the property that groups items together and defines the scope of unique nbr
:s we'll call group
. This nbr
must be in the [1-N] range, and may be set when items are imported from an external source. Since all items must have a nbr
, the task then becomes how to track which values are used, to enable picking a free nbr
for new items that are added manually.
我正在使用 DynamoDB 和 Redis.我在 nbr
上不能有 DynamoDB 索引.到目前为止,我的想法是使用 Redis 来跟踪哪些数字已用于特定组,以便对于诸如 <MYGROUP>-item-nbrs
之类的 Redis 键,我可以存储所有使用 nbr
:s 并实现逻辑以找到下一个空闲的 nbr
.在使用的nbr
范围内的孔是可以接受的,但是在考虑用完的数字之前应该填充孔.
I'm using DynamoDB and Redis. I cannot have a DynamoDB index on nbr
. The idea I have so far is to use Redis to track which numbers have been used for a particular group, so that for a Redis key such as <MYGROUP>-item-nbrs
I could store all the used nbr
:s and implement logic to find the next free nbr
. Holes in the range of used nbr
is acceptable, but holes should be filled before considering the numbers exhausted.
本质上我想找到最大大小为 N 的稀疏数组的未使用索引.
Essentially I want to find unused indices of a sparse array of max size N.
在 Redis 中存储这些信息以快速找到免费的 nbr
的好结构是什么?到目前为止,我的想法包括:
What would be a good structure for storing this information in Redis to quickly find a free nbr
? My ideas so far include:
所有使用过的 nbr 的单个逗号分隔字符串按排序顺序?为了找到一个空闲的 nbr,发出
GET
命令并解析字符串直到找到一个洞或列表的末尾,将选择的数字插入到字符串中,然后替换整个字符串.当 N 很大时,这似乎非常低效.
A single comma-separated string of all used nbr's in sorted order? To find a free nbr, a
GET
command is issued and the string is parsed until a hole is found or the end of the list, the picked number is inserted into the string and then the entire string is replaced. When N is large, this seems very inefficient.
一个散列,其中每个使用的 nbr
都存储为自己的字段,并使用例如HSCAN
遍历散列的字段以找到一个空闲的 nbr
.当N很大时,HSCAN必须扫描很多字段.
A hash where each used nbr
is stored as its own field, and using e.g. HSCAN
to iterate through the fields of the hash to find a free nbr
. When N is large, the HSCAN must scan a lot of fields.
将我的 nbr
:s 划分为称为 p1-20、p21-40、p41-60、...的字段,每个字段都包含一组已使用的 nbr
:s 仅在该分区内,并且当一个分区耗尽时(没有更多的空闲 nbr
:s),将其完全删除以加快进一步迭代的速度.使用 HSCAN 进行迭代,使用 HSET 启动一个新分区.
Partitioning my nbr
:s into fields called say p1-20, p21-40, p41-60, ..., each containing a sorted set of the used nbr
:s within that partition only, and when a partition is exhausted (no more free nbr
:s), remove it completely to speed up further iteration. Use HSCAN to iterate, and HSET to start a new partition.
存储所有免费的 nbr
而不是所有使用的,并使用排序集和 ZPOPMIN 或常规列表和 LPOP,可能划分为子集.使用所有免费的 nbr
1-N 预填充 Redis 看起来很丑陋.
Storing all free nbr
instead of all used, and using sorted sets and ZPOPMIN or regular lists and LPOP, possibly partitioned into sub-sets. Pre-populating Redis with all free nbr
1-N seems ugly though.
假设 N 的数量级为 65536.
Let's say N is in the magnitude of 65536.
出于性能或其他原因,上述任何解决方案是否更好/更糟?有没有更好/更聪明的方法,也许是利用 Redis 的一些我不知道的聪明方面?
Are any of the solutions above better/worse, for performance or other reasons? Is there a better/smarter way, maybe utilizing some clever aspect of Redis that I'm unaware of?
Kevin 的回答导致了以下解决方案(伪代码):
Kevin's answer resulted in the following solution (pseudo code):
function getFreeNbr() {
while (true) {
send "WATCH numbers"
nbr = send "BITPOS numbers 0"
if nbr < N
send "MULTI"
send "SETBIT numbers $nbr 1"
if send "EXEC" != NULL
return nbr
end if
else
send "UNWATCH numbers"
return -1
end if
}
}
推荐答案
如何使用 位图记录,对于每个可能的nbr
,是否使用该值?
How about using Bitmaps to record, for every possible nbr
, whether or not that value is used?
使用SETBIT
来记录一个值被采用::>
SETBIT key [nbr] 1
要找到免费的 nbr
使用 BITPOS
一个>:
To find a free nbr
use BITPOS
:
BITPOS key 0
为了避免竞争条件,您需要确保 get-and-set 是原子的.[OP 在后续问题中解决了这个问题.]
To avoid race conditions you'll want to make sure your get-and-set is atomic. [The OP addresses this in a follow-up question.]
这将需要很少的内存(65536 个可能的值需要 8K 字节).BITPOS
是 O(n),但这不太可能是一个真正的问题.
This will require very little memory (8K bytes for 65536 possible values). BITPOS
is O(n), but that's unlikely to be a real problem.
这篇关于使用Redis从有限范围内生成唯一ID的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!