Question
My model weights (I dump them to weights_before.txt and weights_after.txt) are exactly the same before and after training, i.e. the training doesn't change anything; no fitting is happening.
My data look like this (I basically want the model to predict the sign of feature: result is 0 if feature is negative, 1 if it is positive):
,feature,zerosColumn,result
0,-5,0,0
1,5,0,1
2,-3,0,0
3,5,0,1
4,3,0,1
5,3,0,1
6,-3,0,0
...
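For reference, a table like the one above can be generated with a short pandas snippet (my own sketch, not part of the asker's code; the unnamed first column in the sample is just the DataFrame index written out by to_csv):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(570)
feature = rng.choice([-5, -3, 3, 5], size=7)

df = pd.DataFrame({
    "feature": feature,
    "zerosColumn": 0,
    "result": (feature > 0).astype(int),  # 1 if positive, 0 if negative
})
# df.to_csv("data/sign_data.csv") would produce the CSV shown above,
# with the index as the unnamed leading column
print(df.head())
```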
Summary of my approach:
- Load the data.
- Split it column-wise into x (feature) and y (result), then split both row-wise into test and validation sets.
- Transform these sets into TimeseriesGenerators (not necessary in this scenario, but I want to get this setup working and I don't see any reason why it shouldn't).
- Create and compile a simple Sequential model with a few Dense layers and softmax activation on its output layer; use binary_crossentropy as the loss function.
- Train the model... nothing happens!
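A note on what TimeseriesGenerator does with these settings: with length=1 it pairs each target y[i] with the one-step window x[i-1:i], so the first target is dropped. The equivalent slicing in plain NumPy (a sketch of my own to illustrate the alignment, not the asker's code):

```python
import numpy as np

x = np.array([[-5], [5], [-3], [5], [3]])  # feature column, shape (5, 1)
y = np.array([0, 1, 0, 1, 1])
length = 1

# TimeseriesGenerator(x, y, length=1) yields sample x[i-1:i] with
# target y[i]; the first `length` targets have no preceding window
windows = np.stack([x[i - length:i] for i in range(length, len(x))])
targets = y[length:]
print(windows.shape)  # (4, 1, 1)
```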
The full code:
import keras
import pandas as pd
import numpy as np
np.random.seed(570)
TIMESERIES_LENGTH = 1
TIMESERIES_SAMPLING_RATE = 1
TIMESERIES_BATCH_SIZE = 1024
TEST_SET_RATIO = 0.2 # the portion of total data to be used as test set
VALIDATION_SET_RATIO = 0.2 # the portion of total data to be used as validation set
RESULT_COLUMN_NAME = 'feature'
FEATURE_COLUMN_NAME = 'result'
def create_network(csv_path, save_model):
    before_file = open("weights_before.txt", "w")
    after_file = open("weights_after.txt", "w")
    data = pd.read_csv(csv_path)
    data[RESULT_COLUMN_NAME] = data[RESULT_COLUMN_NAME].shift(1)
    data = data.dropna()
    x = data.ix[:, 1:2]
    y = data.ix[:, 3]
    test_set_length = int(round(len(x) * TEST_SET_RATIO))
    validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))
    x_train_and_val = x[:-test_set_length]
    y_train_and_val = y[:-test_set_length]
    x_train = x_train_and_val[:-validation_set_length].values
    y_train = y_train_and_val[:-validation_set_length].values
    x_val = x_train_and_val[-validation_set_length:].values
    y_val = y_train_and_val[-validation_set_length:].values
    train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_train,
        y_train,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )
    val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_val,
        y_val,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(10, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Dense(10, activation='relu'))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(1, activation='softmax'))
    for item in model.get_weights():
        before_file.write("%s\n" % item)
    model.compile(
        loss=keras.losses.binary_crossentropy,
        optimizer="adam",
        metrics=[keras.metrics.binary_accuracy]
    )
    history = model.fit_generator(
        train_gen,
        epochs=10,
        verbose=1,
        validation_data=val_gen
    )
    for item in model.get_weights():
        after_file.write("%s\n" % item)
    before_file.close()
    after_file.close()

create_network("data/sign_data.csv", False)
Do you have any ideas?
Accepted answer
The problem is that you are using softmax as the activation function of the last layer. Essentially, softmax normalizes its input so that the elements sum to one. Therefore, if you use it on a layer with only one unit (i.e. Dense(1, ...)), it will always output 1. To fix this, change the activation function of the last layer to sigmoid, which outputs a value in the range (0, 1).
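The collapse is easy to verify numerically: softmax over a length-1 vector is exp(z)/exp(z) = 1 for any input, so the output (and hence the gradient) is constant, while sigmoid still varies with the logit. A quick NumPy check (my own illustration, not from the answer):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # numerically stabilized softmax
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# a single-unit softmax is constant: nothing for training to change
print(softmax(np.array([-7.3])))  # [1.]
print(softmax(np.array([42.0])))  # [1.]

# sigmoid keeps a usable, input-dependent output for one-unit binary heads
print(0.0 < sigmoid(-7.3) < 0.5)  # True
```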