python - TensorFlow VarLenFeature vs. FixedLenFeature

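I was trying to save images of different sizes into TFRecords. I found that even though the images have different sizes, I can still load them with FixedLenFeature.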

After checking the docs for FixedLenFeature and VarLenFeature, the difference seems to be that VarLenFeature returns a sparse tensor.

Could anyone illustrate some situations where one should use FixedLenFeature and where one should use VarLenFeature?

Best answer

You can load the images probably because you saved them using the feature type tf.train.BytesList(), so the whole image data is one big bytes value inside a list.

If I'm right, you are using tf.decode_raw to get the data out of the image you load from the TFRecord.
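As a library-free illustration of why this works, here is a minimal sketch that uses only NumPy (the array shapes and the helper names to_record and from_record are made up for the example). Each image, whatever its size, flattens to a single bytes value; only the length of that value differs. The shape has to be stored separately to undo the flattening, which is the role that tf.decode_raw followed by tf.reshape plays on the TensorFlow side.

```python
import numpy as np

# Two hypothetical images of different sizes (height, width, channels).
small = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
large = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)


def to_record(image):
    """Flatten an image into one bytes value plus the shape needed to restore it."""
    return {"image_raw": image.tobytes(), "shape": image.shape}


def from_record(record):
    """Decode the flat bytes, then reshape (the analogue of decode_raw + reshape)."""
    flat = np.frombuffer(record["image_raw"], dtype=np.uint8)
    return flat.reshape(record["shape"])


records = [to_record(small), to_record(large)]

# Each record holds exactly one bytes value; only its length differs.
lengths = [len(r["image_raw"]) for r in records]
restored = [from_record(r) for r in records]
```

Because every record contains exactly one bytes value, a fixed-length feature with a single element is enough to describe it, no matter how large the image is.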

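Regarding example use cases:
I use VarLenFeature to save datasets for an object detection task:
Each image has a variable number of bounding boxes (equal to the number of objects in the image), so I need another feature, objects_number, to track the number of objects (and bboxes).
Each bounding box itself is a list of 4 float coordinates.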
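To make the difference in return types concrete, here is a small sketch that assumes TensorFlow 2.x in eager mode, where the parsing functions live under tf.io. The helper serialize_boxes and the box coordinates are made up for illustration. A VarLenFeature comes back as a tf.SparseTensor whose number of values can differ from record to record, while a FixedLenFeature comes back as a dense tensor of the declared shape.

```python
import tensorflow as tf


def serialize_boxes(flat_boxes):
    """Hypothetical helper: store flat box coordinates plus the box count."""
    feature = {
        "objects_number": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[len(flat_boxes) // 4])),
        "bboxes": tf.train.Feature(
            float_list=tf.train.FloatList(value=flat_boxes)),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


spec = {
    "objects_number": tf.io.FixedLenFeature([], tf.int64),  # always one value
    "bboxes": tf.io.VarLenFeature(tf.float32),              # any number of values
}

one_box = tf.io.parse_single_example(
    serialize_boxes([0.1, 0.2, 0.5, 0.6]), spec)
two_boxes = tf.io.parse_single_example(
    serialize_boxes([0.1, 0.2, 0.5, 0.6, 0.3, 0.3, 0.9, 0.8]), spec)

# The fixed-length field is a dense scalar; the variable-length one is sparse.
is_sparse = isinstance(two_boxes["bboxes"], tf.SparseTensor)
counts = [int(one_box["objects_number"]), int(two_boxes["objects_number"])]
value_counts = [int(tf.size(one_box["bboxes"].values)),
                int(tf.size(two_boxes["bboxes"].values))]
```

The same spec parses both records even though they carry 4 and 8 coordinates respectively, which is the situation VarLenFeature is meant for.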

I'm using the following code to load it:

import tensorflow as tf

# TensorFlow 1.x API. `serialized_example` is assumed to be one serialized
# tf.train.Example read from a TFRecord file.
features = tf.parse_single_example(
    serialized_example,
    features={
        # These fields have a known length (one value each). If they did not,
        # tf.VarLenFeature could be used instead.
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'depth': tf.FixedLenFeature([], tf.int64),
        # Label part
        'objects_number': tf.FixedLenFeature([], tf.int64),
        'bboxes': tf.VarLenFeature(tf.float32),
        'labels': tf.VarLenFeature(tf.int64),
        # Dense data: one bytes value holding the whole image
        'image_raw': tf.FixedLenFeature([], tf.string),
    })

# Get metadata
objects_number = tf.cast(features['objects_number'], tf.int32)
height = tf.cast(features['height'], tf.int32)
width = tf.cast(features['width'], tf.int32)
depth = tf.cast(features['depth'], tf.int32)

# Actual data
image_shape = tf.parallel_stack([height, width, depth])
bboxes_shape = tf.parallel_stack([objects_number, 4])

# The bbox data is actually dense, so convert the sparse tensor to a dense one
bboxes = tf.sparse_tensor_to_dense(features['bboxes'], default_value=0)

# The shape information was lost when the data was flattened, so restore it
bboxes = tf.reshape(bboxes, bboxes_shape)
image = tf.decode_raw(features['image_raw'], tf.uint8)
image = tf.reshape(image, image_shape)

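Note that "image_raw" is a fixed-length Feature (it has one element) and holds a value of type "bytes"; however, a value of type "bytes" can itself have a variable size (it is a string of bytes and can contain many symbols).
So "image_raw" is a list with ONE element of type "bytes", and that element can be very large.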
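The loading code above uses the TensorFlow 1.x names. For readers on TensorFlow 2.x, the same idea can be sketched as a write-then-read round trip in eager mode, using the tf.io.* spellings of the parsing functions. The helpers make_example and parse_example, and the array contents used with them, are made up for illustration; the 'labels' field is left out to keep the sketch short.

```python
import numpy as np
import tensorflow as tf


def _int64(value):
    """Wrap a single integer as an int64 feature."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[int(value)]))


def make_example(image, bboxes):
    """Serialize an (H, W, C) uint8 image and an (N, 4) float32 box array."""
    height, width, depth = image.shape
    feature = {
        "height": _int64(height),
        "width": _int64(width),
        "depth": _int64(depth),
        "objects_number": _int64(bboxes.shape[0]),
        # Variable number of floats: 4 per box, flattened.
        "bboxes": tf.train.Feature(
            float_list=tf.train.FloatList(value=bboxes.reshape(-1).tolist())),
        # Exactly one bytes value, however large the image is.
        "image_raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image.tobytes()])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def parse_example(serialized):
    """Parse one record and restore the image and box shapes."""
    parsed = tf.io.parse_single_example(serialized, {
        "height": tf.io.FixedLenFeature([], tf.int64),
        "width": tf.io.FixedLenFeature([], tf.int64),
        "depth": tf.io.FixedLenFeature([], tf.int64),
        "objects_number": tf.io.FixedLenFeature([], tf.int64),
        "bboxes": tf.io.VarLenFeature(tf.float32),
        "image_raw": tf.io.FixedLenFeature([], tf.string),
    })
    height = tf.cast(parsed["height"], tf.int32)
    width = tf.cast(parsed["width"], tf.int32)
    depth = tf.cast(parsed["depth"], tf.int32)
    objects_number = tf.cast(parsed["objects_number"], tf.int32)

    # decode_raw yields a flat uint8 vector; reshape it with the stored shape.
    image = tf.io.decode_raw(parsed["image_raw"], tf.uint8)
    image = tf.reshape(image, tf.stack([height, width, depth]))

    # The sparse bboxes tensor is really dense data; densify and reshape it.
    bboxes = tf.sparse.to_dense(parsed["bboxes"])
    bboxes = tf.reshape(bboxes, tf.stack([objects_number, 4]))
    return image, bboxes
```

A possible usage, with two records of different image sizes and box counts going through the same parsing spec:

```python
img_a = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
box_a = np.array([[0.0, 0.0, 0.5, 0.5]], dtype=np.float32)
img_b = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
box_b = np.array([[0.1, 0.1, 0.4, 0.4],
                  [0.2, 0.3, 0.9, 0.8]], dtype=np.float32)

image_a, bboxes_a = parse_example(make_example(img_a, box_a))
image_b, bboxes_b = parse_example(make_example(img_b, box_b))
```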

To elaborate a bit more on how this works:
Features are lists of values, and those values have a specific "type".

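The data types available for features are a subset of the data types available for tensors. You have:

int64 (64 bits in memory)

bytes (occupies as many bytes in memory as it needs)

float (32-bit floating point, as stored in tf.train.FloatList)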

You can check the full list of tensor data types in the TensorFlow documentation.

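So you can store variable-length data without VarLenFeature at all (and in fact this works well), but you first need to convert it into a bytes/string feature and then decode it after loading.
This is the most common approach.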
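As a sketch of that approach, assuming TensorFlow 2.x in eager mode: the bounding boxes from the earlier use case could be packed into a single bytes feature and decoded with tf.io.decode_raw, so no VarLenFeature is involved. The feature key "bboxes_raw" and both helper names are made up for this example.

```python
import numpy as np
import tensorflow as tf


def serialize_boxes_as_bytes(bboxes):
    """Pack an (N, 4) box array into one bytes feature."""
    # Force little-endian float32 so the bytes match decode_raw's default.
    raw = np.asarray(bboxes).astype("<f4").tobytes()
    feature = {
        "bboxes_raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[raw])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def parse_boxes_from_bytes(serialized):
    """Read the single bytes value back and rebuild the (N, 4) array."""
    parsed = tf.io.parse_single_example(
        serialized, {"bboxes_raw": tf.io.FixedLenFeature([], tf.string)})
    flat = tf.io.decode_raw(parsed["bboxes_raw"], tf.float32)
    # Four coordinates per box; the number of boxes follows from the length.
    return tf.reshape(flat, [-1, 4])


boxes = np.array([[0.1, 0.1, 0.4, 0.4],
                  [0.2, 0.3, 0.9, 0.8],
                  [0.0, 0.5, 0.5, 1.0]], dtype=np.float32)
decoded = parse_boxes_from_bytes(serialize_boxes_as_bytes(boxes))
```

The trade-off is that the record no longer says what the bytes mean: the reader has to know the dtype and the per-row width (here float32 and 4) to decode them correctly.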

Regarding python - TensorFlow VarLenFeature vs. FixedLenFeature, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41921746/