How to get the shape of a tf.data.Dataset?

Question

I know a dataset has output_shapes, but it shows something like the following:

How can I get the total number of records in my dataset?

Answer

Where the length is known, you can call:

tf.data.experimental.cardinality(dataset)
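
As a rough illustration of when that call succeeds and when it does not, here is a minimal sketch assuming TensorFlow 2.x with eager execution. The toy datasets are made up for the example; the two constants are the sentinel values the function returns when the size is unknown or infinite.

```python
import tensorflow as tf

# Toy dataset whose size is known statically (an assumption for this example).
dataset = tf.data.Dataset.range(10)

# cardinality() returns a scalar int64 tensor; .numpy() requires eager execution.
known = tf.data.experimental.cardinality(dataset).numpy()

# After filter(), the size cannot be determined without running the pipeline.
filtered = dataset.filter(lambda x: x % 2 == 0)
unknown = tf.data.experimental.cardinality(filtered).numpy()

# repeat() with no count produces an infinite dataset.
infinite = tf.data.experimental.cardinality(dataset.repeat()).numpy()

# Compare against the sentinel constants rather than hard-coded numbers.
is_unknown = unknown == tf.data.experimental.UNKNOWN_CARDINALITY
is_infinite = infinite == tf.data.experimental.INFINITE_CARDINALITY
```

Checking the result against the sentinel constants before using it as a length avoids treating a negative sentinel as a real record count.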

But if this fails, it's important to know that a TensorFlow Dataset is (in general) lazily evaluated, which means that in the general case we may need to iterate over every record before we can find the length of the dataset.

For example, assuming you have eager execution enabled and it's a small 'toy' dataset that fits comfortably in memory, you could just enumerate it into a new list and grab the last index (then add 1, because lists are zero-indexed):

dataset_length = [i for i,_ in enumerate(dataset)][-1] + 1

Of course, this is inefficient at best and, for large datasets, may fail entirely because the list has to fit in memory. It also raises an IndexError on an empty dataset. In such circumstances I can't see any alternative other than to iterate through the records keeping a manual count.
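
A manual count of that kind might look like the sketch below. The helper name count_records is hypothetical, and a plain Python generator stands in for the dataset so the example runs without TensorFlow; it only assumes the input is iterable.

```python
def count_records(records):
    """Count the items in any iterable without storing them."""
    count = 0
    for _ in records:
        count += 1
    return count


# A generator stands in for a large dataset here (illustrative data only).
toy_records = (i * i for i in range(1000))
dataset_length = count_records(toy_records)
```

With eager execution enabled, a tf.data.Dataset is itself iterable, so count_records(dataset) should work the same way. It still reads every record once, and it never returns on an infinite dataset such as one built with repeat().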


08-06 09:55