Problem description
I am trying to use tft.compute_and_apply_vocabulary and tft.tfidf to compute TF-IDF in my Jupyter notebook. However, I always get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'compute_and_apply_vocabulary/vocabulary/Placeholder' with dtype string
[[node compute_and_apply_vocabulary/vocabulary/Placeholder (defined at C:\Users\secsi\Anaconda3\envs\tf2\lib\site-packages\tensorflow_
However, the placeholder type actually is string.
Here is my code:
import tensorflow as tf
import tensorflow_transform as tft

with tf.Session() as sess:
    documents = [
        "a b c d e",
        "f g h i j",
        "k l m n o",
        "p q r s t",
    ]
    documents_tensor = tf.placeholder(tf.string)
    tokens = tf.compat.v1.string_split(documents_tensor)
    compute_vocab = tft.compute_and_apply_vocabulary(tokens, vocab_filename='vocab.txt')
    global_vars_init = tf.global_variables_initializer()
    tabel_init = tf.tables_initializer()
    sess.run([global_vars_init, tabel_init])
    token2ids = sess.run(tfidf, feed_dict={documents_tensor: documents})
    print(f"token2ids: {token2ids}")
Versions:
- tensorflow:1.14
- tensorflow-transform:0.14
Thanks in advance!
Recommended answer
Unlike regular TensorFlow operations, which can be run directly in a Session, TensorFlow Transform operations such as tft.compute_and_apply_vocabulary cannot be used that way. The vocabulary has to be computed in a full pass over the dataset, and in the graph it exists only as a placeholder that the Beam pipeline fills in later, which is why the Session reports a placeholder that was never fed.
To use TensorFlow Transform operations, we must call them inside a preprocessing_fn, which is then passed to tft_beam.AnalyzeAndTransformDataset.
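The reason for this two-step design can be seen in a plain-Python sketch that does not use TensorFlow Transform at all. The function names analyze and transform are made up for illustration, and the tie-breaking order of the vocabulary is an arbitrary choice here, not what the library does. The sketch only mimics the two phases: a full pass over the dataset to build the vocabulary, followed by a per-row mapping that uses it.

```python
from collections import Counter


def analyze(documents):
    """Full pass over the data: build a token -> id vocabulary."""
    counts = Counter(tok for doc in documents for tok in doc.split())
    # Most frequent first; ties broken alphabetically (arbitrary choice,
    # made only so that this sketch is deterministic).
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: idx for idx, tok in enumerate(ordered)}


def transform(document, vocab, oov_id=-1):
    """Per-row step: map tokens to ids using the vocabulary from analyze()."""
    return [vocab.get(tok, oov_id) for tok in document.split()]


documents = ["a b c d e", "f g h i j", "k l m n o", "p q r s t"]
vocab = analyze(documents)            # needs every document
token_ids = [transform(doc, vocab) for doc in documents]
print(token_ids[0])                   # ids for "a b c d e"
```

No single row can be mapped to ids before analyze has seen all rows, which is the part a bare Session.run cannot do for us.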
In your case, since you have text data, your code can be modified as shown below:
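import tensorflow as tf
import tensorflow_transform as tft
import tensorflow_transform.beam as tft_beam

# VOCAB_SIZE, REVIEW_KEY, REVIEW_WEIGHT_KEY, LABEL_KEY, train_data and
# RAW_DATA_METADATA are assumed to be defined elsewhere, as in the
# sentiment analysis example.

def preprocessing_fn(inputs):
    """inputs is a dict of tensors, one entry per feature of the dataset."""
    documents = inputs['documents']
    tokens = tf.compat.v1.string_split(documents)
    # Cap the vocabulary at VOCAB_SIZE so that it matches the size passed to tft.tfidf.
    compute_vocab = tft.compute_and_apply_vocabulary(tokens, top_k=VOCAB_SIZE)
    # Add one for the oov bucket created by compute_and_apply_vocabulary.
    review_bow_indices, review_weight = tft.tfidf(compute_vocab,
                                                  VOCAB_SIZE + 1)
    return {
        REVIEW_KEY: review_bow_indices,
        REVIEW_WEIGHT_KEY: review_weight,
        LABEL_KEY: inputs[LABEL_KEY]
    }

# This statement must run inside a `with tft_beam.Context(temp_dir=...)` block.
(transformed_train_data, transformed_metadata), transform_fn = (
    (train_data, RAW_DATA_METADATA)
    | 'AnalyzeAndTransform' >> tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))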
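To sanity-check the weights that come out of the pipeline, it can help to compute TF-IDF by hand on a tiny corpus. The sketch below is plain Python, not the library's implementation. It assumes term frequency is the token count divided by the document length, and the smoothed inverse document frequency 1 + log((N + 1) / (df + 1)), which is how I read the tft.tfidf documentation; please verify this against the version you have installed. The helper name tfidf_weights is hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """Return one {token: weight} dict per document (assumed smoothed formula)."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each token occur?
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        weights.append({
            tok: (cnt / len(toks))
            * (1.0 + math.log((n_docs + 1) / (doc_freq[tok] + 1)))
            for tok, cnt in counts.items()
        })
    return weights


documents = ["a b c d e", "f g h i j", "k l m n o", "p q r s t"]
weights = tfidf_weights(documents)
print(weights[0])
```

With the documents from the question, every token occurs once in exactly one document, so under these assumptions all tokens end up with the same weight.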
You can refer to this link for an example of how to perform data preprocessing with TensorFlow Transform on a text dataset (sentiment analysis).
If you find this answer useful, kindly accept it and/or upvote it. Thanks.