

A few months back I was tasked with implementing a unique, random code for our web application. The code had to be user-friendly and as short as possible, but still essentially random (so users couldn't easily predict the next code in the sequence).


It ended up generating values that looked something like this:


Unfortunately, I was never satisfied with the implementation. GUIDs were out of the question; they were simply too long and too difficult for users to type in. I was hoping for something more along the lines of 4 or 5 characters/digits, but our particular implementation generated noticeably patterned sequences if we encoded to fewer than 9 characters.


We pulled a unique sequential 32-bit ID from the database and inserted it into the center bits of a random 64-bit integer. We created a lookup table of easily typed and recognized characters (A-Z, a-z, 2-9, skipping easily confused characters such as L, l, 1, O, 0, etc.). Finally, we used that lookup table to base-54 encode the 64-bit integer. The high bits were random, the low bits were random, but the center bits were sequential.
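For readers who want to see the scheme concretely, here is a rough Python sketch of it. Several details are assumptions rather than what we actually shipped: the 16/32/16 bit layout, the exact 54-character alphabet (the letters I, L, O, i, l, o are dropped here to reach 54), and the helper names `make_code`, `encode`, `decode`, and `extract_id`.

```python
import secrets
import string

# Assumed alphabet: A-Z, a-z, 2-9 minus look-alike letters (54 symbols).
AMBIGUOUS = set("ILOilo")
ALPHABET = "".join(
    c
    for c in string.ascii_uppercase + string.ascii_lowercase + "23456789"
    if c not in AMBIGUOUS
)
BASE = len(ALPHABET)  # 54


def encode(value: int) -> str:
    """Base-54 encode a non-negative integer using ALPHABET."""
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value:
        value, remainder = divmod(value, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(code: str) -> int:
    """Inverse of encode()."""
    value = 0
    for ch in code:
        value = value * BASE + ALPHABET.index(ch)
    return value


def make_code(seq_id: int) -> str:
    """Place a 32-bit sequential ID between 16 random high and low bits."""
    if not 0 <= seq_id < 2**32:
        raise ValueError("seq_id must fit in 32 bits")
    high = secrets.randbits(16)
    low = secrets.randbits(16)
    return encode((high << 48) | (seq_id << 16) | low)


def extract_id(code: str) -> int:
    """Recover the sequential ID from the center bits of a code."""
    return (decode(code) >> 16) & 0xFFFFFFFF
```

Because the sequential ID can be read straight back out with `extract_id`, uniqueness is guaranteed, but so is the pattern: consecutive IDs differ only in the low end of the center bits.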

The final result was a code that was much shorter than a GUID and looked random, even though it absolutely wasn't.


I was never satisfied with this particular implementation. What would you guys have done?



I'd obtain a list of common English words with usage frequency and some grammatical information (such as whether each word is a noun or a verb). You can probably find a copy somewhere on the intertubes. Firefox is open source and it has a spellchecker, so a word list must be obtainable somehow.


Then I'd run a filter on it so that obscure words and words that are too long are excluded.
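A minimal sketch of that filter in Python might look like this. The `Word` record, the sample entries, their frequency numbers, and the thresholds are all made up for illustration; a real word list would have its own format and scale.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    frequency: float  # hypothetical unit: occurrences per million words
    pos: str          # part of speech, e.g. "noun" or "verb"


# Invented sample data, not taken from any real corpus.
SAMPLE_WORDS = [
    Word("river", 85.0, "noun"),
    Word("jump", 60.0, "verb"),
    Word("quixotic", 0.4, "adjective"),
    Word("responsibility", 70.0, "noun"),
]


def filter_words(words, min_frequency=10.0, max_length=6):
    """Keep only common, short, purely alphabetic words."""
    return [
        w
        for w in words
        if w.frequency >= min_frequency
        and len(w.text) <= max_length
        and w.text.isalpha()
    ]
```

With the sample data, only "river" and "jump" survive: "quixotic" is too rare and "responsibility" is too long.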

Then my generation algorithm would pick 2 words from the list, concatenate them, and append a random 3-digit number.
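Roughly, in Python, and assuming a tiny hard-coded `WORDS` list standing in for the filtered dictionary (the list contents and the `make_code` name are placeholders):

```python
import secrets

# Placeholder word list; the real one would come from the filtered dictionary.
WORDS = ["river", "stone", "jump", "paint", "cloud", "carry"]


def make_code(words=WORDS):
    """Two randomly chosen words, camel-cased, plus a zero-padded 3-digit number."""
    first = secrets.choice(words)
    second = secrets.choice(words)
    number = secrets.randbelow(1000)  # 0..999
    return f"{first.capitalize()}{second.capitalize()}{number:03d}"
```

`secrets` is used instead of `random` so that the next code is hard to predict from earlier ones, which was the original requirement.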


I could also randomize the word-selection pattern between verbs and nouns, like



The casing needn't be camel case; you can randomize that as well. You can also randomize the placement of the number and of the verb/noun.
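Here is a sketch of what those extra knobs could look like in Python. The `NOUNS` and `VERBS` lists, the set of patterns, and the three casing styles are illustrative choices of mine, not a fixed recipe.

```python
import secrets

# Placeholder vocabularies for illustration.
NOUNS = ["river", "stone", "cloud", "tiger"]
VERBS = ["jump", "paint", "carry", "build"]

PATTERNS = [("verb", "noun"), ("noun", "verb"), ("noun", "noun"), ("verb", "verb")]
CASINGS = [str.lower, str.upper, str.capitalize]


def make_code():
    """Randomize the word pattern, the casing, and where the number goes."""
    pools = {"noun": NOUNS, "verb": VERBS}
    pattern = secrets.choice(PATTERNS)
    casing = secrets.choice(CASINGS)
    parts = [casing(secrets.choice(pools[kind])) for kind in pattern]
    number = f"{secrets.randbelow(1000):03d}"
    # Insert the number before, between, or after the two words.
    parts.insert(secrets.randbelow(len(parts) + 1), number)
    return "".join(parts)
```

Each knob multiplies the number of possible codes, but mixed casing also makes codes harder to read aloud or type, so it is worth weighing against the user-friendliness requirement.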

And since that's a lot of randomizing, Jeff's The Danger of Naïveté is a must-read. Also make sure to study dictionary attacks well in advance.


After implementing it, I'd run a test to make sure that my algorithm doesn't produce collisions. If the collision rate were high, I'd play with the parameters (number of nouns used, number of verbs used, length of the random number, total number of words, different kinds of casing, etc.).
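A quick way to run that kind of test is sketched below in Python. The 2,000-word vocabulary is an assumed figure, and the helper names are mine. Keep in mind that purely random generation can only make collisions rare, not impossible, so the final uniqueness check still belongs in the database.

```python
import secrets


def keyspace(word_count, digits=3):
    """Number of distinct codes for two words plus an n-digit number."""
    return word_count**2 * 10**digits


def count_collisions(generate, samples):
    """Call generate() `samples` times and count repeated values."""
    seen = set()
    collisions = 0
    for _ in range(samples):
        code = generate()
        if code in seen:
            collisions += 1
        else:
            seen.add(code)
    return collisions


def expected_collisions(samples, space):
    """Birthday-style estimate, reasonable when samples is much smaller than space."""
    return samples * (samples - 1) / (2 * space)


if __name__ == "__main__":
    word_count = 2000  # assumed vocabulary size after filtering

    def generate():
        # Indices stand in for actual words to keep the sketch short.
        return (
            secrets.randbelow(word_count),
            secrets.randbelow(word_count),
            secrets.randbelow(1000),
        )

    samples = 100_000
    print("keyspace:", keyspace(word_count))
    print("observed collisions:", count_collisions(generate, samples))
    print("expected collisions:", expected_collisions(samples, keyspace(word_count)))
```

If the observed count is well above the estimate, the generator is probably not uniform; if both are too high for comfort, that is the signal to enlarge the word list or lengthen the number.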


07-31 09:36