[tf.keras] Reproducible tf.keras models

Building models with Keras is simple and quick to pick up, and Keras is also TensorFlow's high-level API, so it is worth learning.

Reproducibility matters in experiments. After training a model we can save its checkpoint, but rerunning the same training code never seems to give the same result.

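To make a Keras model reproducible, first set the seed of every possible source of randomness, for example np.random.seed(1). Then restrict the code to the CPU instead of running it on the GPU.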

(With Conv2D layers, results did not seem to be reproducible on the GPU, even with batch_size=1; they were reproducible only on the CPU.)
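The effect of seeding can be checked in isolation before any model is involved. The sketch below uses only Python's built-in random module; `sample_run` is a hypothetical stand-in for a training run, not part of Keras. The same pattern applies to np.random.seed.

```python
import random


def sample_run(seed):
    """Stand-in for a training run: returns a few pseudo-random 'weights'."""
    random.seed(seed)  # put the global generator in a known state
    return [random.random() for _ in range(3)]


run_a = sample_run(seed=1)
run_b = sample_run(seed=1)
run_c = sample_run(seed=2)

print(run_a == run_b)  # True: same seed, same numbers
print(run_a == run_c)  # False: different seed, different numbers
```

If two runs with the same seed already differ at this level, some source of randomness is still unseeded.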

My TensorFlow and Keras versions:

import tensorflow as tf

print(tf.VERSION)              # '1.10.0'
print(tf.keras.__version__)    # '2.1.6-tf'

Configuration for reproducible Keras models:

import numpy as np
import tensorflow as tf
import random as rn
import os

# Run on CPU only. To run the code on GPU, delete the following line.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Hash randomization is fixed when the interpreter starts, so for this
# process PYTHONHASHSEED should also be set before launching Python.
os.environ["PYTHONHASHSEED"] = '0'

# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.
np.random.seed(42)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.
rn.seed(12345)

# Force TensorFlow to use single thread.
# Multiple threads are a potential source of non-reproducible results.
# For further details, see: https://stackoverflow.com/questions/42022950/
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)

# Use the backend of tf.keras, not of the standalone keras package,
# so that the session below is the one tf.keras models actually use.
K = tf.keras.backend

# The below tf.set_random_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/set_random_seed
tf.set_random_seed(1234)

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

# Rest of code follows ...
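One detail of the configuration above is easy to miss: Python fixes its hash randomization when the interpreter starts, so assigning os.environ["PYTHONHASHSEED"] inside a running script does not change hashing in that same process. The variable is normally set in the environment before launch, for example `PYTHONHASHSEED=0 python script.py` (the script name here is only a placeholder). A minimal sketch that checks the effect by starting fresh interpreters, where `string_hash_in_child` is a hypothetical helper:

```python
import os
import subprocess
import sys


def string_hash_in_child(hash_seed):
    """Start a fresh interpreter and return hash('keras') as seen there."""
    env = dict(os.environ)                    # copy the current environment
    env["PYTHONHASHSEED"] = str(hash_seed)    # applied before the child starts
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('keras'))"],
        env=env, stdout=subprocess.PIPE, check=True,
    )
    return int(result.stdout.decode().strip())


first = string_hash_in_child(0)
second = string_hash_in_child(0)
print(first == second)  # True: with a fixed hash seed, string hashes repeat across runs
```

This matters mostly when code iterates over sets or dicts keyed by strings and the iteration order feeds into training, for example when building a vocabulary.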

The same problem occurs with the TensorFlow low-level API, that is, when you build the layers yourself with tf.variable_scope() and tf.get_variable().

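The Keras documentation (see References) explains this roughly as follows: on a GPU, some operations have non-deterministic outputs because many operations run in parallel and their execution order is not guaranteed, and with limited floating-point precision even adding the same numbers in a different order can give slightly different results.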
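One point the Keras FAQ makes is that floating-point addition depends on the order of the operands, which is why a parallel reduction with no fixed execution order can change the result. The sketch below shows this on the CPU with plain Python floats; `sum_in_order` is a hypothetical helper and the values are chosen only to make the effect obvious.

```python
def sum_in_order(values):
    """Add floats strictly left to right, like a sequential reduction."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, added in two different orders.
left_to_right = sum_in_order([1e16, 1.0, -1e16])   # (1e16 + 1.0) - 1e16
reordered = sum_in_order([1e16, -1e16, 1.0])       # (1e16 - 1e16) + 1.0

print(left_to_right)  # 0.0, the 1.0 is lost when added to 1e16
print(reordered)      # 1.0

# The effect also shows up with everyday values.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```

In a real network the differences are tiny per operation, but they accumulate over many training steps, which is enough to make two runs diverge.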

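As for how PyTorch ensures reproducibility: cuDNN optimizes convolution operations, giving up some precision in exchange for speed. The code below forces cuDNN to produce deterministic results, at a cost in efficiency. For details, see the blog post by hyk_1996 listed in References.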

from torch.backends import cudnn

cudnn.benchmark = False            # with benchmark=True, results may not be deterministic
cudnn.deterministic = True         # restrict cuDNN to deterministic algorithms

References

How can I obtain reproducible results using Keras during development? -- Keras Documentation

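Can Keras with Tensorflow backend be forced to use CPU or GPU at will?

PyTorch reproducibility issues: how to make experimental results reproducible (PyTorch的可重复性问题（如何使实验结果可复现）) -- hyk_1996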