TensorFlow VarLenFeature vs. FixedLenFeature

Question

I was trying to save images of different sizes into TFRecords. I found that even though the images have different sizes, I can still load them with FixedLenFeature.

By checking the docs on FixedLenFeature and VarLenFeature, I found that the difference seems to be that VarLenFeature returns a sparse tensor.

Could anyone illustrate some situations in which one should use FixedLenFeature or VarLenFeature?

Recommended answer

You can probably load the images because you saved them using the feature type tf.train.BytesList(), so the whole image data is one big bytes value inside a list.

If I'm right, you're using tf.decode_raw to get the data out of the image you load from the TFRecord.

Regarding example use cases: I use VarLenFeature to save datasets for an object detection task. There is a variable number of bounding boxes per image (equal to the number of objects in the image), so I need another feature, objects_number, to track the number of objects (and bboxes). Each bounding box itself is a list of 4 float coordinates.

I'm using the following code to load it:

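import tensorflow as tf  # written against the TF 1.x API

# serialized_example: one serialized tf.train.Example read from a TFRecord
features = tf.parse_single_example(
    serialized_example,
    features={
        # Scalar metadata: each field holds exactly one value, so a
        # FixedLenFeature with shape [] is enough
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'depth': tf.FixedLenFeature([], tf.int64),
        # Label part: objects_number is a scalar; bboxes and labels vary
        # in length per image, so they use tf.VarLenFeature
        'objects_number': tf.FixedLenFeature([], tf.int64),
        'bboxes': tf.VarLenFeature(tf.float32),
        'labels': tf.VarLenFeature(tf.int64),
        # Image: a single bytes value, whatever the image size
        'image_raw': tf.FixedLenFeature([], tf.string)
    })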

# Get metadata
objects_number = tf.cast(features['objects_number'], tf.int32)
height = tf.cast(features['height'], tf.int32)
width = tf.cast(features['width'], tf.int32)
depth = tf.cast(features['depth'], tf.int32)

# Shapes rebuilt from the metadata
image_shape = tf.stack([height, width, depth])
bboxes_shape = tf.stack([objects_number, 4])

# VarLenFeature returns a sparse tensor, but the bbox data is actually
# dense, so convert it to a dense tensor
bboxes = tf.sparse_tensor_to_dense(features['bboxes'], default_value=0)

# Since information about shape is lost reshape it
bboxes = tf.reshape(bboxes, bboxes_shape)
image = tf.decode_raw(features['image_raw'], tf.uint8)
image = tf.reshape(image, image_shape)
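
To make the sparse-to-dense and reshape steps concrete, here is a minimal plain-Python sketch of the same idea. It does not call TensorFlow. The helper names sparse_to_dense and reshape_rows and the sample coordinates are made up for illustration, and the indices are simplified to plain integers rather than the [N, 1] index matrix a real sparse tensor carries.

```python
# Plain-Python illustration only; these helpers are hypothetical, not TensorFlow API.

def sparse_to_dense(indices, values, length, default_value=0.0):
    """Place each value at its index in a list of `length`; fill the rest with the default."""
    dense = [default_value] * length
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


def reshape_rows(flat, rows, cols):
    """Split a flat list into `rows` lists of `cols` items, like reshaping to [rows, cols]."""
    if len(flat) != rows * cols:
        raise ValueError("flat list has %d items, expected %d" % (len(flat), rows * cols))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# Two boxes stored as one flat run of 8 floats; the per-box structure is lost on disk.
objects_number = 2
indices = list(range(8))
values = [0.1, 0.2, 0.5, 0.6, 0.3, 0.3, 0.9, 0.8]

flat_bboxes = sparse_to_dense(indices, values, length=8)
# objects_number is what lets us recover one row of 4 coordinates per box.
bboxes = reshape_rows(flat_bboxes, objects_number, 4)
```

This is why objects_number is stored as its own feature: the flat list alone does not say where one box ends and the next begins.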

Notice that "image_raw" is a fixed-length feature (it has one element) and holds values of type "bytes". However, a value of type "bytes" can itself have variable size (it is a string of bytes and can contain any number of them). So "image_raw" is a list with ONE element of type "bytes", which can be very large.
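
The point can be shown with a small plain-Python sketch, assuming images are nested lists of 0..255 integers. It does not use TensorFlow; encode_image and decode_image are hypothetical helpers that only mimic the "one bytes value in a list, decoded as uint8 and reshaped" pattern.

```python
# Plain-Python illustration only; these helpers are hypothetical, not TensorFlow API.

def encode_image(pixels):
    """Flatten a height x width x depth nested list of 0..255 ints into one bytes value."""
    return bytes(v for row in pixels for px in row for v in px)


def decode_image(raw, height, width, depth):
    """Turn the bytes back into a nested list, using the separately stored shape."""
    if len(raw) != height * width * depth:
        raise ValueError("byte count does not match the given shape")
    flat = list(raw)  # each byte becomes an int in 0..255, like decoding as uint8
    return [
        [flat[(r * width + c) * depth:(r * width + c + 1) * depth] for c in range(width)]
        for r in range(height)
    ]


small = [[[1, 2, 3], [4, 5, 6]]]                                        # 1 x 2 x 3
large = [[[c for _ in range(3)] for c in range(4)] for _ in range(5)]   # 5 x 4 x 3

# Each feature is a list with exactly ONE element, regardless of image size.
small_feature = [encode_image(small)]
large_feature = [encode_image(large)]

restored = decode_image(large_feature[0], 5, 4, 3)
```

Both features have length 1, which is what the fixed-length declaration counts; only the size of the single bytes value differs (6 bytes versus 60 bytes here).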

To further elaborate on how it works: features are lists of values, and those values have a specific "type".

The data types for features are a subset of the data types for tensors. You have:

  • int64 (64 bits in memory)
  • bytes (occupies as many bytes in memory as the value needs)
  • float (stored as 32-bit floats in tf.train.FloatList)

You can check the tensor data types in the TensorFlow documentation.

So you can store variable-length data without VarLenFeature at all (in fact, you already do), but first you need to convert it into a bytes/string feature and then decode it. This is the most common method.
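
As a rough illustration of the "convert to bytes, then decode" approach, the sketch below packs a variable-length list of floats into a single bytes value and unpacks it again using only the Python standard library. The function names are hypothetical, the values use the machine's native byte order, and no TensorFlow call is involved; in TensorFlow the decoding step would be done with tf.decode_raw.

```python
# Plain-Python illustration only; these helpers are hypothetical, not TensorFlow API.
from array import array


def floats_to_bytes(values):
    """Pack any number of floats into one bytes value (C float, native byte order)."""
    return array("f", values).tobytes()


def bytes_to_floats(raw):
    """Decode a bytes value produced by floats_to_bytes back into a list of floats."""
    out = array("f")
    out.frombytes(raw)  # raises ValueError if len(raw) is not a multiple of the item size
    return list(out)


# Two records of different lengths; each becomes a single bytes value.
short_record = floats_to_bytes([0.5, 0.25])
long_record = floats_to_bytes([1.0, 2.0, 4.0, 8.0, 16.0])

decoded_short = bytes_to_floats(short_record)
decoded_long = bytes_to_floats(long_record)
```

The length of the original list is recoverable from the byte count, but any higher-dimensional shape still has to be stored separately, as with height, width and depth in the loading code.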


10-29 07:16