Problem description
I am attempting to write a simple deep learning model using TensorFlow. I'm using a toy dataset I made up in Excel just to get the model working and accepting data. My code is as follows:
import pandas as pd
import numpy as np
import tensorflow as tf
raw_data = np.genfromtxt('ai/mock-data.csv', delimiter=',', dtype=str)
my_data = np.delete(raw_data, (0), axis=0) #deletes the first row, axis=0 indicates row, axis=1 indicates column
my_data = np.delete(my_data, (0), axis=1) #deletes the first column
policy_state = tf.feature_column.categorical_column_with_vocabulary_list('policy_state', [
    'AL', 'CA', 'MI'
])
modern_classic_ind = tf.feature_column.categorical_column_with_vocabulary_list('modern_classic_ind', [
    '0', '1'
])
h_plus_ind = tf.feature_column.categorical_column_with_vocabulary_list('h_plus_ind', [
    '0', '1'
])
retention_ind = tf.feature_column.categorical_column_with_vocabulary_list('retention_ind', [
    '0', '1'
])
feature_columns = [
    tf.feature_column.indicator_column(policy_state),
    tf.feature_column.indicator_column(modern_classic_ind),
    tf.feature_column.indicator_column(h_plus_ind)
]
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                        hidden_units=[10, 20, 10],
                                        n_classes=3,
                                        model_dir="/tmp/ret_model")
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(my_data[:, 0:3], dtype=str)},
    y=np.array(np.array(my_data[:, 3], dtype=str)),
    num_epochs=None,
    shuffle=True)
classifier.train(input_fn=train_input_fn, steps=2000)
Unfortunately, I am getting the following error. I have tried trimming the header labels off the CSV file versus leaving them in, giving the feature columns different names, and changing the dtype of the NumPy array. The error persists.
ValueError: Feature h_plus_ind is not in features dictionary.
If I remove h_plus_ind, it simply throws the same error for a different column.
Recommended answer
When using tf.feature_column, the data returned by your input_fn must have the same keys as the feature columns you created earlier. So the x of your train_input_fn should be a dictionary whose keys are the names of the feature columns (here policy_state, modern_classic_ind and h_plus_ind), not a single "x" key.
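As a minimal sketch of how the array from the question could be split into such a dictionary: the snippet below uses a small in-memory array as a hypothetical stand-in for my_data, and it assumes the first three columns are policy_state, modern_classic_ind and h_plus_ind, in that order, with the label in the fourth. The helper name build_features is made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for my_data after the header row and first column
# have been deleted. Assumed column order: policy_state, modern_classic_ind,
# h_plus_ind, retention_ind (the label).
my_data = np.array([
    ["AL", "0", "1", "0"],
    ["CA", "1", "0", "1"],
    ["MI", "0", "0", "1"],
], dtype=str)

# These names must match the keys given to the feature columns.
FEATURE_NAMES = ["policy_state", "modern_classic_ind", "h_plus_ind"]


def build_features(data, names):
    """Return a dict mapping each feature name to its 1-D column of `data`."""
    return {name: data[:, i] for i, name in enumerate(names)}


x = build_features(my_data, FEATURE_NAMES)

# Assumption: the labels are cast to integer class ids instead of being
# left as strings.
y = my_data[:, 3].astype(np.int32)

print(sorted(x))
```

The resulting x and y would then take the place of the x={"x": ...} and y=... arguments in the question's numpy_input_fn call. The integer cast is an assumption on my part: as far as I know, DNNClassifier expects integer class ids unless a label_vocabulary is supplied.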
A mock example:
x = {"policy_state": np.array(['AL','AL','AL','AL','AL']),
"modern_classic_ind": np.array(['0','0','0','0','0']),
"h_plus_ind": np.array(['0','0','0','0','0']),}
As a side note:
This article from the Google Developers blog is well worth reading: it introduces a newer way to create an input_fn directly from a CSV file with the tf.data.Dataset API. That approach manages memory better and avoids loading the whole dataset into memory.
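To illustrate the idea of reading the file row by row rather than loading it whole, here is a rough sketch using only Python's standard csv module. It is not the tf.data API the article describes. The header names in the sample text are assumptions, since the question does not show the CSV header, and iter_examples is a hypothetical helper.

```python
import csv
import io

# Assumed header names; the real CSV header is not shown in the question.
FEATURE_NAMES = ["policy_state", "modern_classic_ind", "h_plus_ind"]
LABEL_NAME = "retention_ind"


def iter_examples(csv_file):
    """Lazily yield (features, label) pairs, one CSV row at a time.

    `features` is a dict keyed by feature-column name, so the keys match
    what the feature columns expect.
    """
    for row in csv.DictReader(csv_file):
        features = {name: row[name] for name in FEATURE_NAMES}
        yield features, int(row[LABEL_NAME])


# Hypothetical CSV content; a real run would pass an open file handle,
# e.g. open('ai/mock-data.csv', newline='').
sample = io.StringIO(
    "id,policy_state,modern_classic_ind,h_plus_ind,retention_ind\n"
    "1,AL,0,1,0\n"
    "2,CA,1,0,1\n"
)

examples = list(iter_examples(sample))
print(len(examples))
```

A generator like this only holds one row in memory at a time. Wiring it into TensorFlow (for instance through tf.data.Dataset.from_generator) is left out here, and the article's own approach may differ.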