I am using TensorFlow 1.8.0 and Python 3.6.5.
The data is the iris dataset. The code is as follows:
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import tensorflow as tf
iris = load_iris()  # load the dataset before indexing into it
X = iris['data']
y = iris['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
input_train = tf.estimator.inputs.numpy_input_fn(
    x=X_train, y=y_train, num_epochs=100, shuffle=False)
classifier_model = tf.estimator.DNNClassifier(
    hidden_units=[10, 20, 10], n_classes=3, feature_columns=??)
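This is my question: how do I set feature_columns for a NumPy matrix?
If I convert X and y to pandas.DataFrame, I can use the following code for feature_columns, and it works with the DNNClassifier model.
features = X.columns
feature_columns = [tf.feature_column.numeric_column(key=key) for key in features]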
Best answer
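You can wrap the NumPy ndarray in a dictionary and pass it to numpy_input_fn as the input x, then use the key in that dictionary to define your feature_column. Also note that because each row of X_train has 4 features, you need to specify the shape parameter when defining tf.feature_column.numeric_column. Here is the complete code: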
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import tensorflow as tf
iris = load_iris()
X = iris['data']
y = iris['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
input_train = tf.estimator.inputs.numpy_input_fn(
    x={'x': X_train},
    y=y_train,
    num_epochs=100,
    shuffle=False)
feature_columns = [tf.feature_column.numeric_column(key='x', shape=(X_train.shape[1],))]
classifier_model = tf.estimator.DNNClassifier(
    hidden_units=[10, 20, 10],
    n_classes=3,
    feature_columns=feature_columns)
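To make the relationship between the dictionary keys and the shape argument more concrete, here is a minimal NumPy-only sketch. It uses a zero-filled stand-in array instead of the real iris data, and the per-column names f0..f3 are hypothetical. Option B is an assumption on my part: it mirrors the pandas approach from the question (one key per column) and is not part of the answer above.

```python
import numpy as np

# Stand-in for X_train: 120 rows, 4 features.
# Placeholder values only, not the real iris measurements.
X_train = np.zeros((120, 4), dtype=np.float32)

# Option A (the approach in the answer): one key holds the whole matrix.
# The matching numeric_column then needs shape=(X_train.shape[1],).
features_single = {'x': X_train}
single_shape = (X_train.shape[1],)

# Option B (assumption): one key per column, using hypothetical names f0..f3.
# Each key would pair with a numeric_column that keeps the default shape (1,).
feature_names = ['f{}'.format(i) for i in range(X_train.shape[1])]
features_split = {name: X_train[:, i] for i, name in enumerate(feature_names)}

print(single_shape)
print(sorted(features_split))
print(features_split['f0'].shape)
```

Either dictionary could then be passed as x to numpy_input_fn, as long as the keys match the key arguments of the feature columns. Option A keeps the code shorter, while Option B may be handier if you later want to treat individual features differently.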