ValueError: `validation_split` is only supported for Tensors or NumPy arrays, found: (keras.preprocessing.sequence.TimeseriesGenerator object)

Problem description

When I tried to add validation_split to my LSTM model, I got this error:

ValueError: `validation_split` is only supported for Tensors or NumPy arrays, found: (<tensorflow.python.keras.preprocessing.sequence.TimeseriesGenerator object)

Here is the code:

from keras.preprocessing.sequence import TimeseriesGenerator
train_generator = TimeseriesGenerator(df_scaled, df_scaled, length=n_timestamp, batch_size=1)

model.fit(train_generator, epochs=50, verbose=2, callbacks=[tensorboard_callback], validation_split=0.1)

----------
ValueError: `validation_split` is only supported for Tensors or NumPy arrays, found: (<tensorflow.python.keras.preprocessing.sequence.TimeseriesGenerator object)

One reason I can think of is that validation_split expects a tensor or NumPy array, as the error says, whereas passing the training data through TimeseriesGenerator turns it into a 3D array.
And since TimeseriesGenerator is mandatory when using an LSTM, does this mean we can't use validation_split with an LSTM?

Recommended answer

Your first intuition is right: you can't use validation_split when using a dataset generator.

You have to understand how a dataset generator works. The model.fit API does not know how many records or batches your dataset has during its first epoch, because the data is generated or supplied to the model one batch at a time. So the API has no way of knowing how many records there are up front and carving a validation set out of them. For this reason you cannot use validation_split with a dataset generator. You can read this in the Keras documentation:

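Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. This argument is not supported when x is a dataset, generator or keras.utils.Sequence instance.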

Note the last sentence, which says that the argument is not supported for datasets, generators, or keras.utils.Sequence instances (TimeseriesGenerator is a keras.utils.Sequence).
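The documentation's remark that validation data is taken from the last samples can be reproduced by hand before any generator is built. The following is a minimal sketch in plain Python, not Keras's own implementation; the helper name `tail_split` and the toy `series` are hypothetical and only illustrate the idea.

```python
def tail_split(samples, validation_split):
    """Split a sequence into (train, validation), taking validation from the tail.

    Hypothetical helper for illustration; it mimics the documented behaviour
    of validation_split for arrays, without shuffling.
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")
    n_val = int(len(samples) * validation_split)  # number of held-out samples
    split_at = len(samples) - n_val               # index where validation starts
    return samples[:split_at], samples[split_at:]


# Toy stand-in for the scaled time series (assumption: 20 ordered samples).
series = list(range(20))
train_part, val_part = tail_split(series, 0.1)
```

Each part could then be wrapped in its own TimeseriesGenerator, with the second one passed to model.fit through the validation_data argument instead of validation_split. Keeping the split chronological also means the validation windows come strictly after the training windows.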

What you can do instead is split the dataset yourself, for example with the following code, which appears to assume that dataset is a tf.data.Dataset. You can read about it in detail at the linked reference; I am only reproducing the important part here.

# Splitting the dataset for training and testing.
def is_test(x, _):
    return x % 4 == 0


def is_train(x, y):
    return not is_test(x, y)


recover = lambda x, y: y

# Split the dataset for testing/validation (every 4th element).
test_dataset = dataset.enumerate() \
    .filter(is_test) \
    .map(recover)

# Split the dataset for training (the remaining elements).
train_dataset = dataset.enumerate() \
    .filter(is_train) \
    .map(recover)
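To see which elements end up where under this enumerate/filter/map pattern, here is a small plain-Python sketch of the same index rule. It does not use TensorFlow; the function name `split_every_nth` and the toy `batches` list are hypothetical stand-ins for the dataset elements.

```python
def split_every_nth(items, n=4):
    """Send every n-th element (index % n == 0) to the held-out set.

    Hypothetical helper mirroring the is_test / is_train predicates above.
    """
    held_out = [item for index, item in enumerate(items) if index % n == 0]
    kept = [item for index, item in enumerate(items) if index % n != 0]
    return kept, held_out


# Toy stand-in for eight dataset elements.
batches = ["b0", "b1", "b2", "b3", "b4", "b5", "b6", "b7"]
train_batches, val_batches = split_every_nth(batches)
```

With n=4 roughly a quarter of the elements are held out. Because the held-out elements are interleaved with the training ones, overlapping time-series windows may share values across the two sets, so a chronological split may be the safer choice for forecasting tasks.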

I hope my answer helps you.

09-03 09:26