I am trying to generate N sets of independent random numbers. The short script below shows the problem for 3 sets of 10 random numbers. Even though I use tf.set_random_seed to set the seed, the results of the different runs are not the same. Any help or comments are greatly appreciated.

(py3p6) bash-3.2$ cat test.py
import tensorflow as tf
for i in range(3):
  tf.set_random_seed(1234)  # seed set inside the loop; the value 1234 is an assumption
  generate = tf.random_uniform((10,), 0, 10)
  with tf.Session() as sess:
    b = sess.run(generate)
    print(b)


# output :
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[8.559105  3.2390785 6.447526  8.316823  1.6297233 1.4103293 2.647568
 2.954973  6.5975866 7.494894 ]
[2.0277488 6.6134906 0.7579422 4.6359386 6.97507   3.3192968 2.866236
 2.2205782 6.7940736 7.2391043]

I would like the three runs to give identical results instead, like this:


[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128
 7.9785547 8.296125  8.388672 ]

Update 1: The reason I put the seed initializer inside the for loop is that I want a different seed for each iteration (think of them as separate MCMC runs, for instance). The code below does the job, but I am not sure it is efficient. Basically, I generate a few random seeds between 0 and 2^32-1 and use a different one in each run. Any help or comments on making it more memory/RAM efficient are greatly appreciated.

import numpy as np
import tensorflow as tf
global_seed = 42
N_chains = 5
np.random.seed(global_seed)  # assumed: seed NumPy so the per-chain seeds are reproducible
seeds = np.random.randint(0, 4294967295, size=N_chains)

for i in range(N_chains):
    # .... some stuff ....
    kernel_initializer = tf.random_normal_initializer(seed=int(seeds[i]))
    # .... some stuff
    with tf.Session() as sess:
        pass  # .... some stuff .....


In TensorFlow, a random operation relies on two different seeds: a global seed, set by tf.set_random_seed, and an operation seed, passed as an argument to the operation. The docs describe in more detail how the two relate.


You have a different seed for each random op because each random op maintains its own internal state for pseudo-random number generation. The reason each random generator maintains its own state is robustness to change: if they all shared the same state, adding a new random generator anywhere in the graph would change the values produced by all the others, defeating the purpose of using a seed.
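
The robustness argument can be illustrated outside TensorFlow. The sketch below is only an analogy built on Python's standard random module: run_shared and run_per_op are hypothetical helpers, and the seeds are arbitrary. It compares the value seen by one "op" when an extra op is added, first with a shared generator and then with one generator per op.

```python
import random


def run_shared(seed, add_extra_op):
    """All ops draw from one generator, so draw order matters."""
    rng = random.Random(seed)
    if add_extra_op:
        rng.random()  # the extra op consumes a value from the shared stream first
    return rng.random()  # value seen by op "a"


def run_per_op(seed_a, seed_extra, add_extra_op):
    """Each op owns its generator, so other ops cannot disturb it."""
    rng_a = random.Random(seed_a)
    if add_extra_op:
        random.Random(seed_extra).random()  # the extra op uses its own state
    return rng_a.random()  # value seen by op "a"


# With shared state, adding an op changes what op "a" produces.
shared_changed = run_shared(0, False) != run_shared(0, True)
# With per-op state, op "a" is unaffected.
per_op_stable = run_per_op(1, 2, False) == run_per_op(1, 2, True)
print(shared_changed, per_op_stable)  # True True
```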

Now, why do we have this dual system of global and per-op seeds? Well, the global seed is not actually necessary. It is there for convenience: it lets you set all random op seeds to different, deterministic (if unknown) values at once, without having to go through every one of them by hand.
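
As a rough illustration of that convenience, the sketch below derives one seed per generator from a single global seed. It uses only Python's standard random module; derive_op_seeds is a hypothetical helper, and this is not how TensorFlow itself derives op seeds.

```python
import random


def derive_op_seeds(global_seed, n_ops):
    """Derive one deterministic seed per op from a single global seed.

    Hypothetical helper for illustration only.
    """
    master = random.Random(global_seed)
    return [master.randrange(2**32) for _ in range(n_ops)]


# The same global seed always yields the same list of per-op seeds,
# so nobody has to choose and record each seed by hand.
seeds_a = derive_op_seeds(42, 3)
seeds_b = derive_op_seeds(42, 3)
print(seeds_a == seeds_b)  # True

# Each op then gets its own independently seeded generator.
generators = [random.Random(s) for s in seeds_a]
```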


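Now, when a global seed is set but the op seed is not, the docs say that the system deterministically picks an operation seed in conjunction with the graph-level (global) seed, so that the op gets a unique random sequence.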


To be more precise, the seed that is provided is the id of the last operation that has been created in the current graph. Consequently, globally-seeded random operations are extremely sensitive to changes in the graph, in particular to operations created before them. For example, with nothing else in the graph:


import tensorflow as tf
tf.set_random_seed(1234)  # global seed; the value 1234 is an assumption
generate = tf.random_uniform(())
with tf.Session() as sess:
  print(sess.run(generate))
  # 0.96046877


Now if we create a node before, the result changes:

import tensorflow as tf
tf.set_random_seed(1234)  # global seed; the value 1234 is an assumption
tf.zeros(()) # new op added before
generate = tf.random_uniform(())
with tf.Session() as sess:
  print(sess.run(generate))
  # 0.29252338


If a node is created afterwards, however, it does not affect the op seed:

import tensorflow as tf
tf.set_random_seed(1234)  # global seed; the value 1234 is an assumption
generate = tf.random_uniform(())
tf.zeros(()) # new op added after
with tf.Session() as sess:
  print(sess.run(generate))
  # 0.96046877
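
The before/after behaviour shown above can be mimicked with a small toy model. This is a sketch under stated assumptions, not TensorFlow's implementation: ToyGraph and first_value are hypothetical names, the graph simply counts ops, and a random op takes the current counter as its op seed before registering itself. A real random op creates several graph nodes, which the toy ignores.

```python
import random


class ToyGraph:
    """Hypothetical stand-in for a graph that numbers ops as they are created."""

    def __init__(self, global_seed):
        self.global_seed = global_seed
        self.last_id = 0

    def add_op(self):
        """Register an ordinary (non-random) op."""
        self.last_id += 1
        return self.last_id

    def random_op(self):
        """Create a random op whose seed is the id of the last op created."""
        op_seed = self.last_id  # picked before this op registers itself
        self.add_op()
        # Combine the global seed and the op seed into a single string seed.
        return random.Random(f"{self.global_seed}:{op_seed}")


def first_value(ops_before, ops_after):
    """First draw of a random op with other ops created before or after it."""
    graph = ToyGraph(global_seed=1234)
    for _ in range(ops_before):
        graph.add_op()
    gen = graph.random_op()
    for _ in range(ops_after):
        graph.add_op()
    return gen.random()


baseline = first_value(0, 0)
print(first_value(1, 0) == baseline)  # False: an op created before shifts the seed
print(first_value(0, 1) == baseline)  # True: an op created after has no effect
```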


Obviously, as in your case, if you generate several operations, they will have different seeds:

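import tensorflow as tf
tf.set_random_seed(1234)  # global seed; the value 1234 is an assumption
gen1 = tf.random_uniform(())
gen2 = tf.random_uniform(())
with tf.Session() as sess:
  print(sess.run(gen1))
  print(sess.run(gen2))
  # 0.96046877
  # 0.85591054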

As a curiosity, and to check that the seed really is just the last id used in the graph, you could align the seed of gen2 with that of gen1:

import tensorflow as tf
tf.set_random_seed(1234)  # global seed; the value 1234 is an assumption
gen1 = tf.random_uniform(())
# 4 operations seem to be created after the seed has been picked
seed = tf.get_default_graph()._last_id - 4
gen2 = tf.random_uniform((), seed=seed)
with tf.Session() as sess:
  print(sess.run(gen1))
  print(sess.run(gen2))
  # 0.96046877
  # 0.96046877


Obviously though, this should not pass code review.

