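So, what I want to do is classify exoplanets versus non-exoplanets using the Kepler data obtained here. The data is a time series with dimensions (num_of_samples, 3197). I figured this could be done with a 1D convolutional layer in Keras, but I keep messing up the dimensions and get the following error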



So the questions are:

1. Does the data (training_set and test_set) need to be converted into a 3D tensor? If so, what are the correct dimensions?

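2. What is the correct input shape? I know I have 3197 time steps with 1 feature, but the documentation doesn't specify whether it assumes the TF or the Theano backend, so I'm still struggling with it.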

By the way, I'm using the TF backend. Any help would be appreciated! Thanks!

"""
Created on Wed May 17 18:23:31 2017

@author: Amajid Sinar
"""

import matplotlib.pyplot as plt
import pandas as pd
plt.style.use("ggplot")
import numpy as np

#Importing training set
training_set = pd.read_csv("exoTrain.csv")
X_train = training_set.iloc[:,1:].values
y_train = training_set.iloc[:,0:1].values

#Importing test set
test_set = pd.read_csv("exoTest.csv")
X_test = test_set.iloc[:,1:].values
y_test = test_set.iloc[:,0:1].values

#Scale the data
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)

#Convert data into 3d tensor
X_train = np.reshape(X_train,(1,X_train.shape[0],X_train.shape[1]))
X_test = np.reshape(X_test,(1,X_test.shape[0],X_test.shape[1]))


#Importing convolutional layers
from keras.models import Sequential
from keras.layers import Convolution1D
from keras.layers import MaxPooling1D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers.normalization import BatchNormalization

#Convolution steps
#1.Convolution
#2.Max Pooling
#3.Flattening
#4.Full Connection

#Initialising the CNN
classifier = Sequential()

#Input shape must be explicitly defined, DO NOT USE (None,shape)!!!
#1.Multiple convolution and max pooling
classifier.add(Convolution1D(filters=8, kernel_size=11, activation="relu", input_shape=(3197,1)))
classifier.add(MaxPooling1D(strides=4))
classifier.add(BatchNormalization())
classifier.add(Convolution1D(filters=16, kernel_size=11, activation='relu'))
classifier.add(MaxPooling1D(strides=4))
classifier.add(BatchNormalization())
classifier.add(Convolution1D(filters=32, kernel_size=11, activation='relu'))
classifier.add(MaxPooling1D(strides=4))
classifier.add(BatchNormalization())
#classifier.add(Convolution1D(filters=64, kernel_size=11, activation='relu'))
#classifier.add(MaxPooling1D(strides=4))


#2.Flattening
classifier.add(Flatten())


#3.Full Connection
classifier.add(Dropout(0.5))
classifier.add(Dense(64, activation='relu'))
classifier.add(Dropout(0.25))
classifier.add(Dense(64, activation='relu'))
classifier.add(Dense(1, activation='sigmoid'))

#Configure the learning process
classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

#Train!
classifier.fit_generator(X_train, steps_per_epoch=X_train.shape[0], epochs=1, validation_data=(X_test,y_test))

score = classifier.evaluate(X_test, y_test)
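
As a sanity check on the layer stack above, the sequence length after each layer can be traced by hand. The sketch below assumes the Keras defaults that the script relies on implicitly: "valid" padding and stride 1 for the convolutions, and pool_size=2 for MaxPooling1D. The helper names (conv1d_out_len, pool1d_out_len, trace_lengths) are hypothetical and not part of Keras.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1


def pool1d_out_len(length, pool_size=2, stride=4):
    """Output length of 1D max pooling with 'valid' padding."""
    return (length - pool_size) // stride + 1


def trace_lengths(length=3197, kernel_size=11, num_blocks=3):
    """Sequence length after every conv and pooling layer, in order."""
    lengths = [length]
    for _ in range(num_blocks):
        length = conv1d_out_len(length, kernel_size)
        lengths.append(length)
        length = pool1d_out_len(length)
        lengths.append(length)
    return lengths


lengths = trace_lengths()
# The last convolution has 32 filters, so Flatten sees lengths[-1] * 32 values.
flattened = lengths[-1] * 32
print(lengths)    # [3197, 3187, 797, 787, 197, 187, 47]
print(flattened)  # 1504
```

Batch normalization does not change the sequence length, so it is left out of the trace. These numbers only hold when each sample reaches the first layer with shape (3197, 1).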

Best answer

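  • Yes, your dataset should be a 3D tensor.
  • (For the TensorFlow backend) the correct input shape is (sample_number, sample_size, channel_number). You can check this against the error message, which says the expected dimensions are (None, 3197, 1).

  • "None" stands for a dimension of arbitrary size, because that axis is the number of samples used for training.

    So, in your case, the correct shape is (570, 3197, 1).

    If you happen to use the Theano backend, you should put the channel dimension first:
    (sample_number, channel_number, sample_size), or in your particular case

    (570, 1, 3197)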
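
The reshape described above could be sketched with NumPy as follows. This is a minimal illustration, assuming the features start out as a 2D array of shape (num_samples, 3197). The helper names to_channels_last and to_channels_first are made up for this example, and a small stand-in array is used in place of the real CSV data.

```python
import numpy as np


def to_channels_last(X):
    """(num_samples, num_timesteps) -> (num_samples, num_timesteps, 1)."""
    X = np.asarray(X)
    return X.reshape(X.shape[0], X.shape[1], 1)


def to_channels_first(X):
    """(num_samples, num_timesteps) -> (num_samples, 1, num_timesteps)."""
    X = np.asarray(X)
    return X.reshape(X.shape[0], 1, X.shape[1])


# Stand-in for the scaled flux matrix: 570 samples, 3197 time steps each.
X_demo = np.zeros((570, 3197))

X_tf = to_channels_last(X_demo)    # layout for the TensorFlow backend
X_th = to_channels_first(X_demo)   # layout for the Theano backend

print(X_tf.shape)  # (570, 3197, 1)
print(X_th.shape)  # (570, 1, 3197)
```

In the question's script this would take the place of the two np.reshape(..., (1, ...)) calls, which put all samples into a single batch entry instead of adding a channel axis. With the arrays in this layout, input_shape=(3197, 1) describes one sample. Because the data is then a plain array rather than a generator, training would presumably go through classifier.fit(X_train, y_train, ...) instead of fit_generator.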

    Regarding deep-learning - Keras 1D CNN: How to specify dimension correctly?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44088859/
