(Original) TensorFlow does not automatically release GPU memory after a function finishes

Please credit the source when reposting:

http://www.cnblogs.com/darkknightzh/p/7608916.html

References:

https://stackoverflow.com/questions/39758094/clearing-tensorflow-gpu-memory-after-model-execution

https://github.com/tensorflow/tensorflow/issues/1727#issuecomment-285815312

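In TensorFlow, if the GPU is configured inside a function and tf allocates GPU memory there, that memory is not released when the function returns (torch7 appears to behave the same way). The second reference explains: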

As for the original problem, currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down. Even if a second session chooses a different GPUOptions, it would not take effect.

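In other words, once the first session has initialized the GPU, a second session that tries to initialize it with different GPU options has no effect, even if the first session's memory has since been released.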
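The behaviour described in the quote can be illustrated with a toy model in plain Python. This is not TensorFlow code: `ToyProcessState`, `ToySession` and the `memory_fraction` option are hypothetical names invented for this sketch, and they only mimic the idea that the first session fixes a process-wide state that later sessions cannot change.

```python
class ToyProcessState:
    """Toy stand-in for a process-wide singleton that owns the GPU allocator."""

    _allocator_options = None  # set once, then kept until the process exits

    @classmethod
    def get_allocator_options(cls, requested):
        # Only the very first caller gets to decide the options.
        if cls._allocator_options is None:
            cls._allocator_options = dict(requested)
        return cls._allocator_options


class ToySession:
    """Toy stand-in for a session (hypothetical, not the TensorFlow class)."""

    def __init__(self, **gpu_options):
        self.requested = gpu_options
        self.effective = ToyProcessState.get_allocator_options(gpu_options)

    def close(self):
        # Closing a session does not reset the process-wide state.
        pass


first = ToySession(memory_fraction=0.3)
first.close()

second = ToySession(memory_fraction=0.9)
print(second.requested)  # {'memory_fraction': 0.9}
print(second.effective)  # {'memory_fraction': 0.3}
```

In this model the only way to get a fresh state is to start a new process, which matches the workaround suggested in the first reference.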

In the first reference, Oli Blum points out that the memory is only released if you "use processes and shut them down after the computation". The code is as follows (see the first reference):

 import multiprocessing

 import numpy as np
 import tensorflow as tf  # written against the TensorFlow 1.x graph API


 def run_tensorflow():
     n_input = 10000
     n_classes = 1000

     # Create model: a single linear layer (no activation)
     def multilayer_perceptron(x, weight):
         layer_1 = tf.matmul(x, weight)
         return layer_1

     # Store layer weights (this model has no bias)
     weights = tf.Variable(tf.random_normal([n_input, n_classes]))

     x = tf.placeholder("float", [None, n_input])
     y = tf.placeholder("float", [None, n_classes])
     pred = multilayer_perceptron(x, weights)

     cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
     optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

     init = tf.global_variables_initializer()

     with tf.Session() as sess:
         sess.run(init)
         for i in range(100):
             # Random data is enough here; only the GPU memory usage matters
             batch_x = np.random.rand(10, n_input)
             batch_y = np.random.rand(10, n_classes)
             sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

     print("finished doing stuff with tensorflow!")


 if __name__ == "__main__":
     # option 1: execute code with extra process
     p = multiprocessing.Process(target=run_tensorflow)
     p.start()
     p.join()

     # wait until user presses enter key
     input()

     # option 2: just execute the function
     run_tensorflow()

     # wait until user presses enter key
     input()

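When run_tensorflow is run through multiprocessing.Process, the GPU memory is released automatically once the process ends; when run_tensorflow is called directly, the memory is not released. Note that this function does very little computation, so on a fast GPU you may not be able to observe the memory being allocated, used and released when it runs through multiprocessing.Process, and it can look as if nothing ran at all.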
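If the function has to return something to the caller, the same idea can be wrapped in a small helper. The following is a minimal sketch using only the standard library, not code from either reference: `run_isolated`, `_worker` and `report_pid` are hypothetical names, and the default `"fork"` start method is an assumption that only holds on Unix-like systems where the parent process has not touched the GPU yet.

```python
import multiprocessing as mp
import os


def _worker(queue, func, args):
    """Run func in the child process and send back its result or its error."""
    try:
        queue.put(("ok", func(*args)))
    except Exception as exc:
        queue.put(("error", repr(exc)))


def run_isolated(func, *args, start_method="fork"):
    """Run func(*args) in a short-lived child process and return its result.

    Whatever the child allocated is handed back to the system when it exits.
    """
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(queue, func, args))
    proc.start()
    # Read the result before join() so a large result cannot block the child.
    status, payload = queue.get()
    proc.join()
    if status == "error":
        raise RuntimeError("child process failed: " + payload)
    return payload


def report_pid():
    """Stand-in for a GPU workload; it only reports which process ran it."""
    return os.getpid()


if __name__ == "__main__":
    child_pid = run_isolated(report_pid)
    print(child_pid != os.getpid())  # True: the work ran in another process
```

With the code above, `run_isolated(run_tensorflow)` would play the role of option 1. The result must be picklable to pass through the queue, and with `start_method="spawn"` the function itself must also be importable by the child. The sketch does not handle a child that is killed before it can report back; in that case `queue.get()` would wait forever, so a timeout may be worth adding.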