This article looks at unexpected layers that appear in the TensorBoard graph of the MNIST example and explains where they come from. It should be a useful reference for anyone who runs into the same problem.

Problem description



In order to learn tensorflow, I executed this official tensorflow mnist script (cnn_mnist.py) and displayed the graph with tensorboard.

The following is part of the code. This network contains two conv layers and two dense layers.

conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu)

pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu)

pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])

dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

dropout = tf.layers.dropout(
    inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

logits = tf.layers.dense(inputs=dropout, units=10)

However, looking at the graph generated by tensorboard, there are three conv layers and three dense layers. I did not expect conv2d_1 and dense_1 to be generated.

Why were conv2d_1 and dense_1 generated?

Solution

This is a good question, because it sheds some light on the inner structure of the tf.layers wrappers. Let's run two experiments:

  • Run the model exactly as in the question.
  • Add explicit names to the layers via the name argument and run again.

The graph without layers' names

That's the same graph as yours, but I expanded it and zoomed in on the logits dense layer. Note that dense_1 contains the layer variables (kernel and bias) and dense_2 contains the ops (matrix multiplication and addition).

This means that this is still one layer, but with two naming scopes - dense_1 and dense_2. This happens because this is the second dense layer, and the first one has already used the naming scope dense. Variable creation is separated from the actual layer logic - there are separate build and call methods - and both of them try to get a unique name for their scope. This leads to dense_1 and dense_2 holding the variables and the ops, respectively.
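To see the same split outside of TensorBoard, here is a minimal sketch (assuming TensorFlow 1.x graph mode, where tf.placeholder and tf.layers are available) that builds two unnamed dense layers and prints the scopes of the variables and ops they create; the exact suffixes may vary slightly between TF 1.x versions:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
h = tf.layers.dense(inputs=x, units=8)   # first unnamed layer claims the scope "dense"
y = tf.layers.dense(inputs=h, units=2)   # second unnamed layer: build and call get uniquified separately

# Variable scopes - expected to show dense/... for the first layer and dense_1/... for the second
print([v.name for v in tf.trainable_variables()])

# Op scopes - the second layer's MatMul/BiasAdd typically land under dense_2/...
print([op.name for op in tf.get_default_graph().get_operations() if op.name.startswith('dense')])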

The graph with names specified

Now let's add name='logits' to the same layer and run again:

logits = tf.layers.dense(inputs=dropout, units=10, name='logits')

You can see there are still 2 variables and 2 ops, but the layer managed to grab one unique name for its scope (logits) and put everything inside it.
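As a quick check without TensorBoard - again only a sketch under the same TF 1.x assumption - you can name the layer and verify that both its variables and its ops now share the single logits prefix:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
h = tf.layers.dense(inputs=x, units=8)
logits = tf.layers.dense(inputs=h, units=2, name='logits')

# Both the kernel/bias variables and the MatMul/BiasAdd ops sit under logits/...
print([v.name for v in tf.trainable_variables() if v.name.startswith('logits')])
print([op.name for op in tf.get_default_graph().get_operations() if op.name.startswith('logits')])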

Conclusion

This is a good example of why explicit naming in tensorflow is beneficial, whether it concerns tensors directly or higher-level layers. There is much less confusion when the model uses meaningful names instead of automatically generated ones.
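For instance, building on the snippet from the question (with input_layer and mode defined as there), every layer could be given an explicit name so the TensorBoard graph groups each layer under a readable scope; the names below are only illustrative choices, not taken from the original script:

conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu, name='conv1')
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2, name='pool1')
conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu, name='conv2')
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2, name='pool2')
pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu, name='dense1')
dropout = tf.layers.dropout(inputs=dense, rate=0.4,
                            training=mode == tf.estimator.ModeKeys.TRAIN, name='dropout')
logits = tf.layers.dense(inputs=dropout, units=10, name='logits')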

This concludes the article on unexpected layers generated in TensorBoard's MNIST example. We hope the answer above is helpful.
