Problem description
I was trying to save images of different sizes into TFRecords. I found that even though the images have different sizes, I can still load them with FixedLenFeature.
By checking the docs on FixedLenFeature and VarLenFeature, I found that the difference seems to be that VarLenFeature returns a sparse tensor.
Could anyone illustrate some situations where one should use FixedLenFeature or VarLenFeature?
Recommended answer
You can probably load the images because you saved them using the feature type tf.train.BytesList(), so the whole image data is one big bytes value inside a list.
If I'm right, you're using tf.decode_raw to get the data out of the image you load from the TFRecord.
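To make this concrete, here is a minimal plain-Python sketch (no TensorFlow involved) of why images of different sizes still fit a feature of length one. The helpers encode_image and decode_image are hypothetical names used only for illustration; decode_image merely imitates what tf.decode_raw followed by tf.reshape would do.

```python
# Conceptual sketch in plain Python, not the TensorFlow API: an image of any
# size becomes exactly ONE bytes value, so the feature length is always 1.

def encode_image(pixels, height, width, depth):
    """Flatten a nested [height][width][depth] list of 0-255 ints into a record."""
    flat = [v for row in pixels for px in row for v in px]
    assert len(flat) == height * width * depth
    return {
        "height": height,
        "width": width,
        "depth": depth,
        "image_raw": [bytes(flat)],  # a list holding a single bytes value
    }


def decode_image(record):
    """Imitate decode_raw (bytes -> uint8 values) followed by a reshape."""
    h, w, d = record["height"], record["width"], record["depth"]
    flat = list(record["image_raw"][0])  # each byte becomes an int in 0..255
    return [
        [flat[(r * w + c) * d:(r * w + c + 1) * d] for c in range(w)]
        for r in range(h)
    ]


small = [[[1, 2, 3], [4, 5, 6]]]                           # 1 x 2 x 3 image
large = [[[r, c, 0] for c in range(4)] for r in range(3)]  # 3 x 4 x 3 image

rec_small = encode_image(small, 1, 2, 3)
rec_large = encode_image(large, 3, 4, 3)
```

Both records hold a single element under "image_raw"; only the length of that one bytes value differs, which is why the stored height, width and depth are needed to restore the shape.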
Regarding example use cases: I use VarLenFeature when saving datasets for an object detection task. There is a variable number of bounding boxes per image (equal to the number of objects in the image), so I need another feature, objects_number, to track the number of objects (and bboxes). Each bounding box is itself a list of 4 float coordinates.
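The writing side is not shown in this answer, so the following is only a rough sketch of how such a record might be built. It assumes TensorFlow 2.x, and make_example is a hypothetical helper; the feature names simply mirror the ones used in the loading code.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x


def _int64(values):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def make_example(image_bytes, height, width, depth, bboxes, labels):
    """Pack one image and its boxes into a tf.train.Example (hypothetical helper)."""
    # Boxes are stored flat: 4 floats per box, objects_number boxes in total.
    flat_boxes = [coord for box in bboxes for coord in box]
    feature = {
        "height": _int64([height]),
        "width": _int64([width]),
        "depth": _int64([depth]),
        "objects_number": _int64([len(bboxes)]),
        "bboxes": tf.train.Feature(
            float_list=tf.train.FloatList(value=flat_boxes)),
        "labels": _int64(labels),
        # The whole image is a single bytes value, whatever its size.
        "image_raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


example = make_example(
    image_bytes=bytes(2 * 2 * 3),  # dummy 2 x 2 x 3 image of zeros
    height=2, width=2, depth=3,
    bboxes=[[0.0, 0.0, 0.5, 0.5], [0.25, 0.25, 1.0, 1.0]],
    labels=[1, 7],
)
serialized = example.SerializeToString()
```

The length of "bboxes" and "labels" changes from image to image, while every other feature always holds exactly one value.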
I'm using the following code to load it:
import tensorflow as tf  # TensorFlow 1.x API

# serialized_example: a scalar string tensor holding one serialized Example
features = tf.parse_single_example(
    serialized_example,
    features={
        # We know the length of these fields. If not,
        # tf.VarLenFeature could be used
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'depth': tf.FixedLenFeature([], tf.int64),
        # Label part
        'objects_number': tf.FixedLenFeature([], tf.int64),
        'bboxes': tf.VarLenFeature(tf.float32),
        'labels': tf.VarLenFeature(tf.int64),
        # Dense data
        'image_raw': tf.FixedLenFeature([], tf.string)
    })

# Get metadata
objects_number = tf.cast(features['objects_number'], tf.int32)
height = tf.cast(features['height'], tf.int32)
width = tf.cast(features['width'], tf.int32)
depth = tf.cast(features['depth'], tf.int32)

# Actual data
image_shape = tf.parallel_stack([height, width, depth])
bboxes_shape = tf.parallel_stack([objects_number, 4])

# The bbox data is actually dense, so convert it to a dense tensor
bboxes = tf.sparse_tensor_to_dense(features['bboxes'], default_value=0)

# Since the shape information is lost, reshape it
bboxes = tf.reshape(bboxes, bboxes_shape)
image = tf.decode_raw(features['image_raw'], tf.uint8)
image = tf.reshape(image, image_shape)
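As a rough illustration of the last two bbox steps, here is a plain-Python sketch (not the TensorFlow API) of what the sparse-to-dense conversion and the reshape amount to. It assumes that a VarLenFeature parsed from a single example comes back as a 1-D sparse tensor described by indices, values and dense_shape; sparse_to_dense and reshape_bboxes are hypothetical helper names.

```python
def sparse_to_dense(indices, values, dense_shape, default_value=0.0):
    """Fill a 1-D dense list from (indices, values, dense_shape)."""
    dense = [default_value] * dense_shape[0]
    for (i,), v in zip(indices, values):
        dense[i] = v
    return dense


def reshape_bboxes(flat, objects_number):
    """Regroup a flat coordinate list into objects_number rows of 4."""
    assert len(flat) == objects_number * 4
    return [flat[i * 4:(i + 1) * 4] for i in range(objects_number)]


# Two boxes stored flat, as they were written into the record.
values = [0.1, 0.2, 0.5, 0.6, 0.3, 0.3, 0.9, 0.8]
indices = [(i,) for i in range(len(values))]
dense_shape = (len(values),)

flat = sparse_to_dense(indices, values, dense_shape)
bboxes = reshape_bboxes(flat, objects_number=2)
```

Because every position is filled, the default value is never used here; the sparse form is just the container that VarLenFeature hands back, and objects_number is what lets the flat list be regrouped into boxes.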
Notice that "image_raw" is a fixed-length feature (it has one element) and holds a value of type "bytes". However, a value of type "bytes" can itself have variable size (it is a string of bytes and can contain any number of them). So "image_raw" is a list with ONE element of type "bytes", which can be very large.
To elaborate further on how it works: features are lists of values, and those values have a specific "type".
The data types for features are a subset of the data types for tensors. You have:
- int64 (64 bits in memory)
- bytes (occupies as many bytes in memory as needed)
- float (32 bits in memory; FloatList stores single-precision values)
You can check the tensor data types in the TensorFlow documentation.
So you can store variable-length data without VarLenFeature at all (in fact, that is what you are already doing), but first you need to convert it into a bytes/string feature and then decode it after loading. This is the most common method.