Question
I'm trying to use tf.data.Dataset to interleave two datasets but having problems doing so. Given this simple example:
ds0 = tf.data.Dataset()
ds0 = ds0.range(0, 10, 2)
ds1 = tf.data.Dataset()
ds1 = ds1.range(1, 10, 2)
dataset = ...
iter = dataset.make_one_shot_iterator()
val = iter.get_next()
What should ... be to produce output like 0, 1, 2, 3...9?
It would seem like dataset.interleave() would be relevant but I haven't been able to formulate the statement in a way that doesn't generate an error.
Recommended answer
MattScarpino is on the right track in his comment. You can use Dataset.zip() along with Dataset.flat_map() to flatten a multi-element dataset:
ds0 = tf.data.Dataset.range(0, 10, 2)
ds1 = tf.data.Dataset.range(1, 10, 2)
# Zip combines an element from each input into a single element, and flat_map
# enables you to map the combined element into two elements, then flattens the
# result.
dataset = tf.data.Dataset.zip((ds0, ds1)).flat_map(
    lambda x0, x1: tf.data.Dataset.from_tensors(x0).concatenate(
        tf.data.Dataset.from_tensors(x1)))
iter = dataset.make_one_shot_iterator()
val = iter.get_next()
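The snippet above uses the TensorFlow 1.x iterator API. As a minimal sketch, assuming TensorFlow 2.x with eager execution (where the `make_one_shot_iterator()` method is no longer available on datasets), the same zip-and-flatten pipeline can be checked by iterating over the dataset directly. The variable name `values` is only for illustration.

```python
import tensorflow as tf

ds0 = tf.data.Dataset.range(0, 10, 2)  # 0, 2, 4, 6, 8
ds1 = tf.data.Dataset.range(1, 10, 2)  # 1, 3, 5, 7, 9

# Pair up one element from each dataset, then expand every pair into a
# two-element dataset; flat_map concatenates those small datasets in order.
dataset = tf.data.Dataset.zip((ds0, ds1)).flat_map(
    lambda x0, x1: tf.data.Dataset.from_tensors(x0).concatenate(
        tf.data.Dataset.from_tensors(x1)))

# Assumes TF 2.x eager mode: as_numpy_iterator() yields plain NumPy scalars.
values = [int(v) for v in dataset.as_numpy_iterator()]
print(values)
```

Note that zip stops at the shorter input, so this approach drops trailing elements if the two datasets differ in length.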
Having said this, your intuition about using Dataset.interleave() is pretty sensible. We're investigating ways that you can do this more easily.
P.S. As an alternative, you can use Dataset.interleave() to solve the problem if you change how ds0 and ds1 are defined:
dataset = tf.data.Dataset.range(2).interleave(
    lambda x: tf.data.Dataset.range(x, 10, 2), cycle_length=2, block_length=1)
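To see why cycle_length=2 and block_length=1 give the desired order, here is a rough pure-Python model of the round-robin behaviour. It is a simplified sketch and not TensorFlow's implementation: the helper name `interleave_model` is hypothetical, and it ignores parallelism and other options of the real `Dataset.interleave()`.

```python
from itertools import islice

_DONE = object()  # sentinel marking an exhausted input source


def interleave_model(inputs, map_func, cycle_length, block_length):
    """Simplified model: round-robin over up to `cycle_length` inner
    iterables, taking `block_length` items from each in turn."""
    source = iter(inputs)
    # Open the first `cycle_length` inner iterators.
    active = [iter(map_func(x)) for x in islice(source, cycle_length)]
    while active:
        still_active = []
        for it in active:
            block = list(islice(it, block_length))
            yield from block
            if len(block) == block_length:
                still_active.append(it)  # may still have more items
                continue
            # Inner iterator ran out: open one for the next input, if any.
            nxt = next(source, _DONE)
            if nxt is not _DONE:
                still_active.append(iter(map_func(nxt)))
        active = still_active


# Mirrors the call above: inputs 0 and 1 map to the even and odd ranges.
result = list(interleave_model(range(2), lambda x: range(x, 10, 2),
                               cycle_length=2, block_length=1))
print(result)
```

In this model, each input element (0 and 1) opens one inner range, and taking one item at a time from each alternates between the even and odd sequences. Raising block_length to 2 would instead make the model emit two evens, then two odds, and so on.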