Problem Description
I recently reviewed an interesting implementation of convolutional text classification. However, all the TensorFlow code I've reviewed uses random (not pre-trained) embedding vectors like the following:
with tf.device('/cpu:0'), tf.name_scope("embedding"):
    W = tf.Variable(
        tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
        name="W")
    self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)
    self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)
Does anybody know how to use the results of Word2vec or a GloVe pre-trained word embedding instead of a random one?
Recommended Answer
There are a few ways that you can use a pre-trained embedding in TensorFlow. Let's say that you have the embedding in a NumPy array called embedding, with vocab_size rows and embedding_dim columns, and you want to create a tensor W that can be used in a call to tf.nn.embedding_lookup().
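How you build that NumPy array depends on your embedding files. Below is a minimal sketch for the common GloVe text format (one word per line, followed by its space-separated float components); the file name and the load_glove helper are hypothetical and not part of the original answer:

import numpy as np

# Sketch: build the `embedding` matrix from a GloVe text file (hypothetical path).
def load_glove(path):
    vocab, vectors = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vocab.append(parts[0])                         # the word itself
            vectors.append([float(x) for x in parts[1:]])  # its embedding components
    return vocab, np.asarray(vectors, dtype=np.float32)

vocab, embedding = load_glove("glove.6B.100d.txt")
vocab_size, embedding_dim = embedding.shape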
1. Simply create W as a tf.constant() that takes embedding as its value:
W = tf.constant(embedding, name="W")
This is the easiest approach, but it is not memory efficient because the value of a tf.constant() is stored multiple times in memory. Since embedding can be very large, you should only use this approach for toy examples.
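For context, a minimal sketch of how this constant might feed the lookup from the question; treating input_x as an integer placeholder of word ids is an assumption carried over from the question's code:

input_x = tf.placeholder(tf.int32, shape=[None, None])  # batch of sequences of word ids
W = tf.constant(embedding, name="W")
embedded_chars = tf.nn.embedding_lookup(W, input_x)     # [batch_size, sequence_length, embedding_dim]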
2. Create W as a tf.Variable and initialize it from the NumPy array via a tf.placeholder():
W = tf.Variable(tf.constant(0.0, shape=[vocab_size, embedding_dim]),
                trainable=False, name="W")
embedding_placeholder = tf.placeholder(tf.float32, [vocab_size, embedding_dim])
embedding_init = W.assign(embedding_placeholder)
# ...
sess = tf.Session()
sess.run(embedding_init, feed_dict={embedding_placeholder: embedding})
This avoids storing a copy of embedding in the graph, but it does require enough memory to keep two copies of the matrix at once (one for the NumPy array, and one for the tf.Variable). Note that I've assumed that you want to hold the embedding matrix constant during training, so W is created with trainable=False.
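As a rough sketch of how this option could replace the random initialization in the question's snippet (the device/name scope and the lookup lines come from the question; in the question's class these tensors would be self.* attributes, and input_x, vocab_size, embedding_dim and embedding are assumed to be defined as above):

with tf.device('/cpu:0'), tf.name_scope("embedding"):
    W = tf.Variable(tf.constant(0.0, shape=[vocab_size, embedding_dim]),
                    trainable=False, name="W")
    embedding_placeholder = tf.placeholder(tf.float32, [vocab_size, embedding_dim])
    embedding_init = W.assign(embedding_placeholder)
    embedded_chars = tf.nn.embedding_lookup(W, input_x)
    embedded_chars_expanded = tf.expand_dims(embedded_chars, -1)

# Later, once a session exists and before training starts:
sess = tf.Session()
sess.run(embedding_init, feed_dict={embedding_placeholder: embedding})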
3. If the embedding was trained as part of another TensorFlow model, you can use a tf.train.Saver to load the value from the other model's checkpoint file. This means that the embedding matrix can bypass Python altogether. Create W as in option 2, then do the following:
W = tf.Variable(...)
embedding_saver = tf.train.Saver({"name_of_variable_in_other_model": W})
# ...
sess = tf.Session()
embedding_saver.restore(sess, "checkpoint_filename.ckpt")
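If you are not sure what the embedding variable was named in the other model, one way to find out is to list the variables stored in the checkpoint first; a short sketch, assuming the same checkpoint path as above:

# Print the (name, shape) of every variable saved in the checkpoint,
# so you can find the right key for the Saver's dictionary above.
for name, shape in tf.train.list_variables("checkpoint_filename.ckpt"):
    print(name, shape)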