Problem description
In order to learn TensorFlow, I executed this official TensorFlow MNIST script (cnn_mnist.py) and displayed the graph with TensorBoard.
The following is part of the code. This network contains two conv layers and two dense layers.
conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
dropout = tf.layers.dropout(
    inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)
logits = tf.layers.dense(inputs=dropout, units=10)
However, looking at the graph generated by TensorBoard, there are three conv layers and three dense layers. I did not expect conv2d_1 and dense_1 to be generated.
Why were conv2d_1 and dense_1 generated?
This is a good question, because it sheds some light on the inner structure of the tf.layers wrappers. Let's run two experiments:
- Run the model exactly as in the question.
- Add explicit names to the layers via the name argument and run again.
The graph without layers' names
That's the same graph as yours, but I expanded and zoomed in on the logits dense layer. Note that dense_1 contains the layer variables (kernel and bias) and dense_2 contains the ops (matrix multiplication and addition).
This means that this is still one layer, but with two naming scopes - dense_1 and dense_2. This happens because this is the second dense layer, and the first one already used the naming scope dense. Variable creation is separated from the actual layer logic - there are build and call methods - and they both try to get a unique name for the scope. This leads to dense_1 and dense_2 holding the variables and ops respectively.
The graph with names specified
Now let's add name='logits' to the same layer and run again:
logits = tf.layers.dense(inputs=dropout, units=10, name='logits')
You can see there are still 2 variables and 2 ops, but the layer managed to grab one unique name for the scope (logits) and put everything inside.
Conclusion
This is a good example of why explicit naming in TensorFlow is beneficial, whether it concerns tensors directly or higher-level layers. There is much less confusion when the model uses meaningful names instead of automatically generated ones.
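Applying that advice to the model from the question, one might give every layer an explicit name. A sketch under the same TF 1.x assumptions; the name values are illustrative choices, not taken from the official script, and input_layer and mode come from the question's model function:

conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu, name="conv1")
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2,
                                name="pool1")
conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu, name="conv2")
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2,
                                name="pool2")
pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu,
                        name="dense1")
dropout = tf.layers.dropout(inputs=dense, rate=0.4,
                            training=mode == tf.estimator.ModeKeys.TRAIN,
                            name="dropout")
# With an explicit name, the layer's variables and ops share one scope,
# so TensorBoard shows a single "logits" node instead of dense_1/dense_2.
logits = tf.layers.dense(inputs=dropout, units=10, name="logits")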