I am developing a general-purpose audio tagging system using Keras.

I have the following input data:
Each sample in x_train has 10 features (data_leng, max, min, etc.), and y_train represents the 41 possible labels (guitar, bass, etc.).

x_train shape = (7104, 10)
y_train shape = (41,)

print(x_train[0])

[ 3.75732000e+05 -2.23437546e-05 -1.17187500e-02  1.30615234e-02
  2.65964586e-03  2.65973969e-03  9.80024859e-02  1.13624850e+00
  1.00003528e+00 -1.11458333e+00]

print(y_train[0])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]



My model is:

import numpy as np
from keras.models import Sequential
from keras.optimizers import SGD
from keras.layers import Dense, Dropout, Activation

model = Sequential()

model.add(Dense(units=128, activation='relu', input_dim=10))
model.add(Dropout(0.5))
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(41, activation='softmax'))

opt = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)

model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

model.fit(np.array(x_train), np.array(y_train), epochs=5, batch_size=8)



These are my results:

Epoch 1/5
7104/7104 [==============================] - 1s 179us/step - loss: 15.7392 - acc: 0.0235
Epoch 2/5
7104/7104 [==============================] - 1s 132us/step - loss: 15.7369 - acc: 0.0236
Epoch 3/5
7104/7104 [==============================] - 1s 133us/step - loss: 15.7415 - acc: 0.0234
Epoch 4/5
7104/7104 [==============================] - 1s 132us/step - loss: 15.7262 - acc: 0.0242
Epoch 5/5
7104/7104 [==============================] - 1s 132us/step - loss: 15.6484 - acc: 0.0291


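As you can see, my results show a very high loss and very low accuracy, but the main problem is that when I try to predict, the output is the same for every input. How can I fix this?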


pre = model.predict(np.array(x_train), batch_size=8, verbose=0)

for i in pre:
    print(i)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
...
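
A side note on the numbers above: the reported loss is close to what categorical cross-entropy gives when the softmax output is saturated to a hard one-hot vector, as in the printed predictions. The sketch below is plain Python, not Keras code, and it assumes Keras clips predicted probabilities at its default backend epsilon of 1e-7; `saturated_loss` is a hypothetical helper name used only for illustration.

```python
import math

KERAS_EPSILON = 1e-7  # assumption: the default value of keras.backend.epsilon()


def saturated_loss(accuracy, epsilon=KERAS_EPSILON):
    """Mean cross-entropy when every prediction is a hard one-hot vector."""
    wrong = -math.log(epsilon)        # true class received (clipped) probability ~0
    right = -math.log(1.0 - epsilon)  # true class received (clipped) probability ~1
    return accuracy * right + (1.0 - accuracy) * wrong


# Accuracies taken from the five epochs in the training log above.
for acc in (0.0235, 0.0236, 0.0234, 0.0242, 0.0291):
    print(f"acc={acc:.4f} -> loss ~ {saturated_loss(acc):.4f}")
```

If the printed values track the logged losses, that suggests the network is emitting the same fully confident class for almost every input rather than learning slowly.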

Best answer

In your Dense layers, you only need to specify input_dim for the first layer.

Keras takes care of the dimensions of the other layers.

So try:

model = Sequential()

model.add(Dense(units=128, activation='relu', input_dim=10))
model.add(Dropout(0.5))
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(41, activation='softmax'))
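
To illustrate why only the first layer needs an input size, here is a minimal plain-Python sketch of how the sizes chain through the stack. It is not how Keras is implemented internally; `dense_shapes` is a hypothetical helper, and the only facts it relies on are that a Dense layer's input size equals the previous layer's `units` and that Dropout leaves the shape unchanged.

```python
def dense_shapes(input_dim, layer_units):
    """Return (inputs, outputs, parameters) for each Dense layer in a stack."""
    rows = []
    fan_in = input_dim
    for units in layer_units:
        params = fan_in * units + units  # weight matrix plus one bias per unit
        rows.append((fan_in, units, params))
        fan_in = units  # the next layer's input size is this layer's output size
    return rows


# Layer sizes from the model above; Dropout layers are skipped because they
# do not change the shape and have no parameters.
rows = dense_shapes(10, [128, 64, 32, 41])
for fan_in, units, params in rows:
    print(f"Dense: {fan_in} -> {units}, {params} parameters")
print("total parameters:", sum(params for _, _, params in rows))
```

Calling `model.summary()` on the real model should report the same per-layer parameter counts.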


Your regularization may also be too strong for this kind of data. Try a lower dropout rate, or no dropout at all.
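
As a rough illustration of how much a rate of 0.5 removes, the sketch below mimics the scaling that Keras's Dropout layer applies during training (kept values are multiplied by 1 / (1 - rate)) on a 32-unit layer like the last hidden layer above. It is plain Python rather than the Keras implementation, and `inverted_dropout` is a hypothetical name.

```python
import random


def inverted_dropout(activations, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so repeated runs give the same mask
hidden = [1.0] * 32     # stand-in for the 32 activations of the last hidden layer
for rate in (0.5, 0.2, 0.0):
    out = inverted_dropout(hidden, rate, rng)
    active = sum(1 for v in out if v != 0.0)
    print(f"rate={rate}: {active} of {len(hidden)} units still active")
```

With three such layers stacked, each training step sees only a fraction of an already small network, which is one plausible reason to try a weaker setting.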

The last thing you can try is increasing the learning rate. Start with something like 1e-3 and see whether anything changes.
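
Both suggestions can be tried together with a small builder function. This is a sketch under assumptions, not a tuned configuration: `build_model` is a hypothetical helper, the default dropout rate of 0.2 is an arbitrary starting point, and `learning_rate` is the newer spelling (Keras 2.3 and later) of the `lr` argument used in the question. The `decay` argument from the question is left out to keep the sketch version-neutral.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD


def build_model(dropout_rate=0.2, learning_rate=1e-3):
    """Same layer sizes as the question; dropout and learning rate are adjustable."""
    model = Sequential()
    # Only the first layer declares the input size (10 features per sample).
    model.add(Dense(units=128, activation='relu', input_dim=10))
    for units in (64, 32):
        if dropout_rate > 0:
            model.add(Dropout(dropout_rate))
        model.add(Dense(units=units, activation='relu'))
    if dropout_rate > 0:
        model.add(Dropout(dropout_rate))
    model.add(Dense(41, activation='softmax'))

    opt = SGD(learning_rate=learning_rate, momentum=0.9, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=opt,
                  metrics=['accuracy'])
    return model
```

For example, `model = build_model(dropout_rate=0.0, learning_rate=1e-3)` followed by the same `model.fit(...)` call as in the question trains the network without dropout, which makes it easier to tell which of the two changes has an effect.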

I hope this helps.

Regarding "python - Keras: very low accuracy, very high loss, and the same prediction for every input", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56063530/

10-12 21:53