Generating non-repeating random numbers in Python

Problem description

Ok, this is one of those trickier-than-it-sounds questions, so I'm turning to Stack Overflow because I can't think of a good answer. Here is what I want: I need Python to generate a simple list of numbers from 0 to 1,000,000,000 in random order to be used for serial numbers (using random numbers so that you can't tell how many have been assigned or do timing attacks as easily, i.e. guess the next one that will come up). These numbers are stored in a database table (indexed) along with the information linked to them. The program generating them doesn't run forever, so it can't rely on internal state.

No big deal right? Just generate a list of numbers, shove them into an array and use Python "random.shuffle(big_number_array)" and we're done. Problem is I'd like to avoid having to store a list of numbers (and thus read the file, pop one off the top, save the file and close it). I'd rather generate them on the fly. Problem is that the solutions I can think of have problems:

1) Generate a random number and then check if it has already been used. If it has been used generate a new number, check, repeat as needed until I find an unused one. Problem here is that I may get unlucky and generate a lot of used numbers before getting one that is unused. Possible fix: use a very large pool of numbers to reduce the chances of this (but then I end up with silly long numbers).

2) Generate a random number and then check if it has already been used. If it has been used add or subtract one from the number and check again, keep repeating until I hit an unused number. Problem is this is no longer a random number as I have introduced bias (eventually I will get clumps of numbers and you'd be able to predict the next number with a better chance of success).

3) Generate a random number and then check if it has already been used. If it has been used, add or subtract another randomly generated number and check again. Problem is we're back to simply generating random numbers and checking, as in solution 1.

4) Suck it up and generate the random list and save it, have a daemon put them into a Queue so there are numbers available (and avoid constantly opening and closing a file, batching it instead).

5) Generate much larger random numbers and hash them (e.g. using MD5) to get a smaller numeric value. We should rarely get collisions, but I end up with larger than needed numbers again.

6) Prepend or append time-based information to the random number (e.g. a Unix timestamp) to reduce the chances of a collision. Again I get larger numbers than I need.
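To put a number on the "I may get unlucky" worry in option 1, here is a minimal sketch of generate-and-check. It is only an illustration under assumptions: the names `next_serial` and `expected_draws` are made up for this example, a plain Python set stands in for the indexed database table, and `secrets.randbelow` is used as the random source because the serials are meant to be hard to guess.

```python
import secrets

POOL_SIZE = 1_000_000_000  # serials are drawn from 0 .. POOL_SIZE - 1


def expected_draws(used_count, pool_size=POOL_SIZE):
    """Mean number of draws needed to hit an unused value.

    Each draw succeeds with probability (pool_size - used_count) / pool_size,
    so the number of draws follows a geometric distribution.
    """
    return pool_size / (pool_size - used_count)


def next_serial(used, pool_size=POOL_SIZE):
    """Draw until an unused value turns up, then record and return it.

    `used` is a set standing in for the database table (an assumption).
    """
    if len(used) >= pool_size:
        raise RuntimeError("pool exhausted")
    while True:
        candidate = secrets.randbelow(pool_size)
        if candidate not in used:
            used.add(candidate)
            return candidate
```

With half of the pool assigned, `expected_draws(500_000_000)` is 2.0, and with 90% assigned, `expected_draws(900_000_000)` is 10.0. Retries only become expensive when the pool is almost full, so option 1 stays cheap as long as far fewer than a billion serials are ever handed out.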

Does anyone have any clever ideas that will reduce the chances of a "collision" (i.e. generating a random number that is already taken) but will also allow me to keep the number "small" (i.e. less than a billion, or a thousand million for you Europeans =)?

Answer and why I accepted it:

So I will simply go with 1 and hope it's not an issue. If it is, I will go with the deterministic solution of generating all the numbers and storing them, so that there is a guarantee of getting a new random number, and I can use "small" numbers (i.e. 9 digits instead of an MD5/etc.).
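Since the serials live in an indexed database table and the generator cannot keep state between runs, one way to apply option 1 is to let the database do the "already used" check through a uniqueness constraint. The sketch below is an assumption-laden illustration, not the asker's actual code: SQLite stands in for whatever database is really used, and the table name `serials`, its columns, and the functions `open_store` and `assign_serial` are all hypothetical.

```python
import secrets
import sqlite3

POOL_SIZE = 1_000_000_000  # serials are drawn from 0 .. POOL_SIZE - 1


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) serials table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS serials ("
        " serial INTEGER PRIMARY KEY,"  # primary key enforces uniqueness
        " info TEXT)"
    )
    return conn


def assign_serial(conn, info, pool_size=POOL_SIZE, max_attempts=100):
    """Insert `info` under a random unused serial and return that serial."""
    for _ in range(max_attempts):
        candidate = secrets.randbelow(pool_size)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO serials (serial, info) VALUES (?, ?)",
                    (candidate, info),
                )
        except sqlite3.IntegrityError:
            continue  # serial already taken; draw again
        return candidate
    raise RuntimeError("no free serial found; the pool may be nearly full")
```

A call such as `assign_serial(open_store(), "first widget")` returns the serial that was stored. Because the check and the insert are a single statement, two processes drawing the same candidate cannot both succeed, and nothing needs to be remembered between runs beyond the table itself.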

Recommended answer