Splitting training data into training and validation sets with tensorflow_datasets.load (TF 2.1)

Question

I'm trying to run a Colab project, but when I try to split the training data into training and validation parts, I get this error:

KeyError: "Invalid split train[:70%]. Available splits are: ['train']"

I use the following code:

(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)

How can I fix this error?

Recommended answer

According to the TensorFlow Datasets docs, the approach you presented is now supported, so the KeyError suggests that the installed tensorflow_datasets version predates this syntax. You split the data by passing a slice specification in the split parameter of tfds.load, for example split="train[:70%]".

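import tensorflow_datasets as tfds

# First 70% of 'train' for training, remaining 30% for validation.
(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)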
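If the same call still raises the KeyError, one thing worth checking is which tensorflow_datasets release is installed, since older releases do not understand the string slicing syntax. Below is a minimal sketch using only the standard library. The helper name installed_version is made up for illustration, and it assumes the package was installed with pip under the distribution name "tensorflow-datasets".

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(distribution):
    """Return the installed version string of a pip distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the package was installed via pip as "tensorflow-datasets".
print("tensorflow-datasets:", installed_version("tensorflow-datasets"))
```

If the reported version turns out to be old, upgrading with pip install --upgrade tensorflow-datasets and restarting the Colab runtime is a reasonable first thing to try.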

With the tfds.load call above, training_set has 2569 entries and validation_set has 1101.
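Those two numbers can be sanity-checked with a little arithmetic: 2569 + 1101 = 3670 examples in total, and 70% of 3670 is exactly 2569. The sketch below reproduces that calculation and adds a generic counter that can be pointed at the loaded datasets. Both function names are illustrative, and rounding the boundary to the nearest index is an assumption that makes no difference here because 70% of 3670 is a whole number.

```python
def percent_split_sizes(total, percent):
    """Sizes of the [:percent%] and [percent%:] slices of `total` examples."""
    # Assumption: the slice boundary is rounded to the nearest index.
    boundary = round(total * percent / 100)
    return boundary, total - boundary


def count_examples(dataset):
    """Count examples by iterating; works for any iterable, e.g. a tf.data.Dataset."""
    return sum(1 for _ in dataset)


# 3670 is the sum of the two counts quoted above (2569 + 1101).
train_size, val_size = percent_split_sizes(3670, 70)
print(train_size, val_size)  # 2569 1101
```

Calling count_examples(training_set) and count_examples(validation_set) on the real datasets should give the same two numbers, at the cost of one full pass over each dataset.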

Thanks to Saman for the comment on API deprecation. In earlier versions of TensorFlow Datasets, the tfds.Split API could be used for this; it is now deprecated:

(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=[
        tfds.Split.TRAIN.subsplit(tfds.percent[:70]),
        tfds.Split.TRAIN.subsplit(tfds.percent[70:])
    ],
    with_info=True,
    as_supervised=True,
)
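When migrating code away from the deprecated subsplit calls, the equivalent string specifications can be generated rather than typed by hand. The helper below is a hypothetical convenience function, not part of tensorflow_datasets; it only builds the two strings used in the supported form of the call.

```python
def percent_split_specs(split_name, percent):
    """Build the two slice strings that cut `split_name` at `percent` percent."""
    if not 0 < percent < 100:
        raise ValueError("percent must be strictly between 0 and 100")
    return [f"{split_name}[:{percent}%]", f"{split_name}[{percent}%:]"]


specs = percent_split_specs("train", 70)
print(specs)  # ['train[:70%]', 'train[70%:]']
```

The resulting list can be passed as the split argument of tfds.load in place of the literal list shown in the answer.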

