Keras model doesn't learn at all


Problem Description

My model weights (I output them to weights_before.txt and weights_after.txt) are precisely the same before and after training, i.e. the training doesn't change anything and no fitting happens.

My data looks like this (I basically want the model to predict the sign of feature: result is 0 if feature is negative, 1 if positive):

,feature,zerosColumn,result
0,-5,0,0
1,5,0,1
2,-3,0,0
3,5,0,1
4,3,0,1
5,3,0,1
6,-3,0,0
...
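
For reference, a small script along the following lines would generate a matching CSV (a minimal sketch; the file path, row count, and value distribution are assumptions based on the sample above, not taken from the question):

import numpy as np
import pandas as pd

np.random.seed(570)

# Hypothetical generator for a CSV shaped like the sample above;
# the exact values and row count are assumptions.
feature = np.random.choice([-5, -3, 3, 5], size=10000)
df = pd.DataFrame({
    'feature': feature,
    'zerosColumn': 0,                     # constant zero column, as in the sample
    'result': (feature > 0).astype(int),  # 1 if feature is positive, 0 if negative
})
df.to_csv('data/sign_data.csv')  # the index becomes the unnamed first column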

A summary of my approach:

  1. Load the data.
  2. Split it column-wise into x (feature) and y (result), then split these two row-wise into test and validation sets.
  3. Transform these sets into TimeseriesGenerators (not necessary in this scenario, but I want to get this setup working and I don't see any reason why it shouldn't).
  4. Create and compile a simple Sequential model with a few Dense layers and softmax activation on its output layer, using binary_crossentropy as the loss function.
  5. Train the model... nothing happens!

The complete code is below:

import keras
import pandas as pd
import numpy as np

np.random.seed(570)

TIMESERIES_LENGTH = 1
TIMESERIES_SAMPLING_RATE = 1
TIMESERIES_BATCH_SIZE = 1024
TEST_SET_RATIO = 0.2  # the portion of total data to be used as test set
VALIDATION_SET_RATIO = 0.2  # the portion of total data to be used as validation set
RESULT_COLUMN_NAME = 'feature'
FEATURE_COLUMN_NAME = 'result'

def create_network(csv_path, save_model):
    before_file = open("weights_before.txt", "w")
    after_file = open("weights_after.txt", "w")

    data = pd.read_csv(csv_path)

    data[RESULT_COLUMN_NAME] = data[RESULT_COLUMN_NAME].shift(1)
    data = data.dropna()

    x = data.ix[:, 1:2]
    y = data.ix[:, 3]

    test_set_length = int(round(len(x) * TEST_SET_RATIO))
    validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))

    x_train_and_val = x[:-test_set_length]
    y_train_and_val = y[:-test_set_length]
    x_train = x_train_and_val[:-validation_set_length].values
    y_train = y_train_and_val[:-validation_set_length].values
    x_val = x_train_and_val[-validation_set_length:].values
    y_val = y_train_and_val[-validation_set_length:].values


    train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_train,
        y_train,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )

    val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_val,
        y_val,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(10, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Dense(10, activation='relu'))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(1, activation='softmax'))

    for item in model.get_weights():
        before_file.write("%s\n" % item)

    model.compile(
        loss=keras.losses.binary_crossentropy,
        optimizer="adam",
        metrics=[keras.metrics.binary_accuracy]
    )

    history = model.fit_generator(
        train_gen,
        epochs=10,
        verbose=1,
        validation_data=val_gen
    )

    for item in model.get_weights():
        after_file.write("%s\n" % item)

    before_file.close()
    after_file.close()

create_network("data/sign_data.csv", False)

Do you have any ideas?

Recommended Answer

The problem is that you are using softmax as the activation function of the last layer. Essentially, softmax normalizes its input so that the elements sum to one. Therefore, if you use it on a layer with only one unit (i.e. Dense(1, ...)), it will always output 1. To fix this, change the activation function of the last layer to sigmoid, which outputs a value in the range (0, 1).
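
As a quick sanity check (my own sketch, not part of the original answer), you can confirm numerically that softmax over a single value is always 1, and the fix is a one-line change:

import numpy as np

# softmax over a single logit: exp(z) / exp(z) == 1, whatever z is
def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([-7.3])))  # [1.] -- the layer can never output class 0

# The fix in the model definition above: replace the softmax output layer with
# model.add(keras.layers.Dense(1, activation='sigmoid'))

Because the softmax output is constant, the gradient flowing back through it is zero, which is exactly why the weights in weights_before.txt and weights_after.txt came out identical. With sigmoid, binary_crossentropy receives a real probability in (0, 1) and the weights actually update.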
