For the past few days I have been running into trouble serializing data to the tfrecord format and then deserializing it with parse_single_sequence_example. I'm trying to retrieve data for use with a fairly standard RNN model, but this is my first attempt at using the tfrecords format and the pipeline that goes with it.

Here is a toy example that reproduces the issue I'm seeing:

import tensorflow as tf
import tempfile
from IPython import embed

sequences = [[1, 2, 3], [4, 5, 1], [1, 2]]
label_sequences = [[0, 1, 0], [1, 0, 0], [1, 1]]

def make_example(sequence, labels):

    ex = tf.train.SequenceExample()

    sequence_length = len(sequence)
    ex.context.feature["length"].int64_list.value.append(sequence_length)

    fl_tokens = ex.feature_lists.feature_list["tokens"]
    fl_labels = ex.feature_lists.feature_list["labels"]
    for token, label in zip(sequence, labels):
        fl_tokens.feature.add().int64_list.value.append(token)
        fl_labels.feature.add().int64_list.value.append(label)
    return ex


writer = tf.python_io.TFRecordWriter('./test.tfrecords')
for sequence, label_sequence in zip(sequences, label_sequences):
    ex = make_example(sequence, label_sequence)
    writer.write(ex.SerializeToString())
writer.close()

tf.reset_default_graph()

file_name_queue = tf.train.string_input_producer(['./test.tfrecords'], num_epochs=None)

reader = tf.TFRecordReader()



context_features = {
    "length": tf.FixedLenFeature([], dtype=tf.int64)
}
sequence_features = {
    "tokens": tf.FixedLenSequenceFeature([], dtype=tf.int64),
    "labels": tf.FixedLenSequenceFeature([], dtype=tf.int64)
}

ex = reader.read(file_name_queue)

# Parse the example (returns a dictionary of tensors)
context_parsed, sequence_parsed = tf.parse_single_sequence_example(
    serialized=ex,
    context_features=context_features,
    sequence_features=sequence_features
)


context = tf.contrib.learn.run_n(context_parsed, n=1, feed_dict=None)
print(context[0])
sequence = tf.contrib.learn.run_n(sequence_parsed, n=1, feed_dict=None)
print(sequence[0])

The associated stack trace is:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 594, in call_cpp_shape_fn
status)
File "/usr/lib/python3.5/contextlib.py", line 66, in exit
next(self.gen)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors.py", line 463, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.InvalidArgumentError: Shape must be rank 0 but is rank 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "my_test.py", line 51, in
sequence_features=sequence_features
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/parsing_ops.py", line 640, in parse_single_sequence_example
feature_list_dense_defaults, example_name, name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/parsing_ops.py", line 837, in _parse_single_sequence_example_raw
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_parsing_ops.py", line 285, in _parse_single_sequence_example
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2382, in create_op
set_shapes_for_outputs(ret)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1783, in set_shapes_for_outputs
shapes = shape_func(op)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 596, in call_cpp_shape_fn
raise ValueError(err.message)
ValueError: Shape must be rank 0 but is rank 1

I posted this as a potential issue on GitHub, although it seems I may just be using it incorrectly: Tensorflow Github Issue
So with that background information, I'm just wondering whether I'm actually making a mistake here? Any help pointing me in the right direction would be greatly appreciated; it's been a few days and my poking around hasn't panned out. Thanks all!

Best answer

Got it, and it was a bad assumption on my part. tf.TFRecordReader.read(queue, name=None) returns a (key, value) tuple; I had assumed it returned just the serialized value, so the whole tuple was being passed directly into the example parser.
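For reference, a minimal sketch of what the corrected read/parse step could look like with the same TF 1.x-era API used above (the variable name serialized_ex is just illustrative, not from the original post):

# read() yields a (key, value) tuple; unpack it instead of passing it whole
key, serialized_ex = reader.read(file_name_queue)

# Pass only the serialized value (a scalar string tensor) to the parser
context_parsed, sequence_parsed = tf.parse_single_sequence_example(
    serialized=serialized_ex,
    context_features=context_features,
    sequence_features=sequence_features
)

Unpacking the tuple means the parser receives a rank-0 string tensor rather than the two-element tuple, which was being converted into a rank-1 tensor and triggering the shape error.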

For the error "tensorflow - Shape must be rank 0 but is rank 1, parse_single_sequence_example", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40275774/
