How to initialize the weights of an MLP using an autoencoder (part 2: deep autoencoder, part 3: stacked autoencoder)

Problem description

I have built an autoencoder (one encoder 8:5, one decoder 5:8) that takes the Pima Indians Diabetes dataset (https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv) and reduces its dimension from 8 to 5. I would now like to use these reduced features to classify the data with an MLP. Here I have some trouble with the basic understanding of the architecture. How do I use the weights of the autoencoder and feed them into the MLP? I have checked these threads: https://github.com/keras-team/keras/issues/91 and https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g. The question is which weight matrix I should consider: the one for the encoder part or the one for the decoder part? When I add the layers for the MLP, how do I initialise their weights with these saved weights? I cannot work out the exact syntax. Also, should my MLP start with 5 neurons, since my reduced dimension is 5? What are possible dimensions of the MLP for this binary classification problem? Could anyone elaborate?

The deep autoencoder code is as follows:

# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy

# Data pre-processing...

# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
                                X, Y, test_size=0.2, random_state=42)

# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
# reuse the min/max learned on the training set; do not refit on the test set
x_test = scalar.transform(x_test)

# Autoencoder code begins here...
encoding_dim1 = 5    # size of encoded representations
encoding_dim2 = 3    # size of encoded representations in the bottleneck layer

# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the first encoded representation of the input
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
# "enc" is the second encoded representation of the input
enc = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
# "dec" is the lossy reconstruction of the input
dec = Dense(encoding_dim1, activation='sigmoid', name='decoder1')(enc)
# "decoded" is the final lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder2')(dec)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)

autoencoder.compile(optimizer='sgd', loss='mse')

# training
autoencoder.fit(x_train, x_train,
            epochs=300,
            batch_size=10,
            shuffle=True,
            validation_data=(x_test, x_test))  # need more tuning

# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)

#The stacked autoencoder code is as follows:

# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy

# Data pre-processing...

# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
                                X, Y, test_size=0.2, random_state=42)

# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
# reuse the min/max learned on the training set; do not refit on the test set
x_test = scalar.transform(x_test)

# Autoencoder code goes here...
encoding_dim1 = 5    # size of encoded representations
encoding_dim2 = 3    # size of encoded representations in the bottleneck layer

# this is our input placeholder
input_data1 = Input(shape=(8,))
# the first encoded representation of the input
encoded1 = Dense(encoding_dim1, activation='relu',
             name='encoder1')(input_data1)
# the first lossy reconstruction of the input
decoded1 = Dense(8, activation='sigmoid', name='decoder1')(encoded1)
# this model maps an input to its first layer of reconstructions
autoencoder1 = Model(inputs=input_data1, outputs=decoded1)
# this is the first encoder model
enc1 = Model(inputs=input_data1, outputs=encoded1)

autoencoder1.compile(optimizer='sgd', loss='mse')

# training
autoencoder1.fit(x_train, x_train, epochs=300,
             batch_size=10, shuffle=True,
             validation_data=(x_test, x_test))
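# the second autoencoder takes 5-dimensional input, so feed it the codes from
# the first encoder rather than the 8-dimensional reconstructions
FirstAEoutput = enc1.predict(x_train)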

input_data2 = Input(shape=(encoding_dim1,))
# the second encoded representations of the input
encoded2 = Dense(encoding_dim2, activation='relu',
             name='encoder2')(input_data2)
# the final lossy reconstruction of the input
decoded2 = Dense(encoding_dim1, activation='sigmoid',
             name='decoder2')(encoded2)

# this model maps an input to its second layer of reconstructions
autoencoder2 = Model(inputs=input_data2, outputs=decoded2)

# this is the second encoder
enc2 = Model(inputs=input_data2, outputs=encoded2)

autoencoder2.compile(optimizer='sgd', loss='mse')

# training
autoencoder2.fit(FirstAEoutput, FirstAEoutput, epochs=300,
             batch_size=10, shuffle=True)

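# this is the overall autoencoder mapping an input to its final reconstructions:
# encoder1 -> (encoder2 -> decoder2) -> decoder1. encoded2 is built on
# input_data2, not input_data1, so the trained pieces are chained explicitly.
stacked = autoencoder1.get_layer('decoder1')(autoencoder2(encoded1))
autoencoder = Model(inputs=input_data1, outputs=stacked)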
# test the autoencoder by encoding and decoding the test dataset

reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)

Recommended answer

So many questions. What have you tried so far? Code snippets?

If your decoder is trying to reconstruct the input, then it doesn't really make sense to me to attach your classifier to its output. I mean, why not just attach it to the input in the first place? So if you are set on using an auto-encoder, I'd say it's pretty clear that you should attach your classifier to the output of the encoder.
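As a rough sketch of that idea applied to the question's stated goal (classifying on the reduced features), the encoder can be pulled out as its own model and its 5-dimensional output fed to a separate MLP. The random arrays stand in for the scaled Pima data, and the hidden width of 4, the `adam` optimizer and the two training epochs are arbitrary assumptions for illustration, not recommendations:

```python
import numpy as np
from keras import Input, Model
from keras.layers import Dense

# Random stand-ins for the scaled features and binary labels (assumption).
rng = np.random.default_rng(0)
x_demo = rng.random((32, 8)).astype("float32")
y_demo = rng.integers(0, 2, size=(32, 1)).astype("float32")

# Autoencoder 8 -> 5 -> 8, as in the question.
inp = Input(shape=(8,))
code = Dense(5, activation="relu", name="encoder")(inp)
recon = Dense(8, activation="sigmoid", name="decoder")(code)
autoencoder = Model(inputs=inp, outputs=recon)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_demo, x_demo, epochs=2, batch_size=8, verbose=0)

# Encoder-only model: it shares the trained layer and outputs the 5-dim code.
encoder = Model(inputs=inp, outputs=code)
features = encoder.predict(x_demo, verbose=0)

# Separate MLP on the reduced features: its input width is 5 because the
# code has 5 dimensions; the hidden width is a free choice.
mlp_in = Input(shape=(5,))
hidden = Dense(4, activation="relu")(mlp_in)
prob = Dense(1, activation="sigmoid")(hidden)
mlp = Model(inputs=mlp_in, outputs=prob)
mlp.compile(optimizer="adam", loss="binary_crossentropy")
mlp.fit(features, y_demo, epochs=2, batch_size=8, verbose=0)

print(features.shape)
```

With this layout the MLP takes 5 inputs, but its hidden layers do not have to have 5 units; the only fixed sizes are the 5-dimensional input and the single sigmoid output for the binary label.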

I'm not quite sure what you mean by "use the weights of the autoencoder and feed them into the mlp". You don't feed a layer with another layer's weights, but with its output signal. This is pretty easy to do in Keras. Let's say you defined your auto-encoder and trained it as such:

from keras import Input, Model
from keras.layers import Dense

x = Input(shape=(8,))
y = Dense(5, activation='sigmoid', name='encoder')(x)
y = Dense(8, name='decoder')(y)

ae = Model(inputs=x, outputs=y)
ae.compile(loss='mse', ...)
ae.fit(x_train, x_train, ...)

# save the trained autoencoder (architecture and weights) to disk
ae.save('./autoencoder.h5')

Then you can attach a classifying layer at the encoder and create a classifier model with the following code:

# load the model from the disk if you
# are in a different execution.
from keras.models import load_model
ae = load_model('./autoencoder.h5')

y = ae.get_layer('encoder').output
y = Dense(1, activation='sigmoid', name='predictions')(y)

classifier = Model(inputs=ae.inputs, outputs=y)
classifier.compile(loss='binary_crossentropy', ...)
classifier.fit(x_train, y_train, ...)

That's it, really. The classifier model will now have the first embedding layer of the ae model, encoder, as its first layer, followed by a sigmoid decision layer, predictions.
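One way to convince yourself that the classifier really starts from the trained encoder is to check that both models hold the same layer object. This is only a sketch: the random data, the `adam` optimizer and the two epochs are placeholders I chose so that it runs end to end.

```python
import numpy as np
from keras import Input, Model
from keras.layers import Dense

# Random stand-in data with the same shape as the scaled features (assumption).
rng = np.random.default_rng(0)
x_demo = rng.random((32, 8)).astype("float32")

# Same autoencoder layout as above: 8 -> 5 -> 8.
x = Input(shape=(8,))
h = Dense(5, activation="sigmoid", name="encoder")(x)
out = Dense(8, name="decoder")(h)
ae = Model(inputs=x, outputs=out)
ae.compile(loss="mse", optimizer="adam")
ae.fit(x_demo, x_demo, epochs=2, batch_size=8, verbose=0)

# Attach the decision layer to the encoder output, not the decoder output.
pred = Dense(1, activation="sigmoid", name="predictions")(
    ae.get_layer("encoder").output)
classifier = Model(inputs=x, outputs=pred)

# Both models reference the very same layer, so the classifier begins with
# the encoder weights learned during autoencoder training.
shared = classifier.get_layer("encoder") is ae.get_layer("encoder")
probs = classifier.predict(x_demo, verbose=0)
print(shared, probs.shape)
```

Because the layer is shared, training the classifier also updates the encoder. If you would rather keep the encoder fixed, setting `trainable = False` on that layer before compiling the classifier is one option.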

If what you are really trying to do is to use the weights learned by the auto-encoder to initialize the weights of the classifier (I'm not positive I recommend this approach):

You can take the weight matrices with layer#get_weights, prune them (because the encoder has 5 units and the classifier only has 1) and finally set the classifier weights. Something along the following lines:

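w, b = ae.get_layer('encoder').get_weights()

# keep a single unit and drop the rest
neuron_to_keep = 2
w = w[:, neuron_to_keep:neuron_to_keep + 1]  # kernel shape (8, 1)
b = b[neuron_to_keep:neuron_to_keep + 1]     # bias shape (1,)

# set_weights expects a list [kernel, bias]. The shapes must match the target
# layer: an (8, 1) kernel fits a Dense(1) layer fed directly by the 8 inputs,
# not the 'predictions' layer above, whose kernel is (5, 1).
classifier.get_layer('predictions').set_weights([w, b])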
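If the goal is to initialize a fresh MLP from the encoder without pruning, a simpler route is to give the MLP a first hidden layer with the same shape as the encoder (8 inputs, 5 units) and copy the weights across unchanged. The sketch below uses an untrained autoencoder as a stand-in for the trained `ae`, and the layer name `hidden1` is a hypothetical name of mine:

```python
import numpy as np
from keras import Input, Model
from keras.layers import Dense

# Untrained stand-in for the trained autoencoder (8 -> 5 -> 8).
# In practice this would be the trained or reloaded `ae`.
x = Input(shape=(8,))
h = Dense(5, activation="sigmoid", name="encoder")(x)
ae = Model(inputs=x, outputs=Dense(8, name="decoder")(h))

# A separate MLP whose first hidden layer matches the encoder's shape.
mlp_in = Input(shape=(8,))
hidden = Dense(5, activation="sigmoid", name="hidden1")(mlp_in)
mlp_out = Dense(1, activation="sigmoid", name="predictions")(hidden)
mlp = Model(inputs=mlp_in, outputs=mlp_out)

# get_weights() returns [kernel, bias]; set_weights() takes a list in the
# same order, and every array must match the target layer's shape.
w, b = ae.get_layer("encoder").get_weights()
mlp.get_layer("hidden1").set_weights([w, b])

w_copy, b_copy = mlp.get_layer("hidden1").get_weights()
print(w.shape, b.shape)
print(np.allclose(w, w_copy) and np.allclose(b, b_copy))
```

Only the encoder weights are reused here. The decoder maps the code back to the input space, so it has no counterpart in the classifier, and the `predictions` layer keeps its default initialization.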
