Problem description
I am using the TensorFlow framework for classification predictions. My dataset contains around 1160 output classes. The output class values are 6-digit numbers, for example 789954. After training and testing on the dataset with TensorFlow, I got an accuracy of around 99%.
The second step is to write the predictions to a csv file so that I can check whether the predicted outcomes (logits) match the original labels in the set. We know that the logits are one-hot encoded vectors, so I did the following steps to decode the one-hot encoding.
prediction=tf.argmax(logits,1)
print(prediction.eval(feed_dict={features : test_features, keep_prob: 1.0}))
prediction = np.asarray(prediction.eval(feed_dict={features : test_features, keep_prob: 1.0}))
prediction = np.reshape(prediction, (test_features.shape[0],1))
np.savetxt("prediction.csv", prediction, delimiter=",")
The resulting values in the csv file are 0.00E+00 for every entry, but I expected the 6-digit codes for the respective entries. I guess I have gone wrong somewhere in my one-hot encoding.
Any help would be appreciated.
Added: I did the one-hot encoding this way.
labels = tf.one_hot(labels, n_classes)
Here n_classes = 1160, and all the label values are 6-digit numbers.
Recommended answer
If each description has only one label, then your approach is fine. Use sklearn's LabelEncoder to convert your categories to integer labels first. With 1160 classes, each example's label should be a value between 0 and 1159, and only then should you apply the one-hot encoding. Passing the raw 6-digit codes to tf.one_hot with depth 1160 gives an all-zero row for every example, because every index is out of range; the argmax of an all-zero row is 0, which is probably why the csv contains only zeros.
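The round trip described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's actual pipeline: the sample codes in raw_labels are made up, the "logits" are just the one-hot rows standing in for the network output, and np.unique is used in place of LabelEncoder (it produces the same sorted-class-to-index mapping as LabelEncoder.fit_transform, and indexing classes plays the role of LabelEncoder.inverse_transform). Note also that np.savetxt defaults to the format "%.18e", so integer predictions are written in scientific notation unless fmt is set.

```python
import io
import numpy as np

# Hypothetical raw labels: 6-digit class codes (the real set has about 1160 of them).
raw_labels = np.array([789954, 123456, 789954, 555555, 123456])

# Map each code to an index in [0, n_classes - 1].
# classes is the sorted array of unique codes; label_ids[i] is the index of raw_labels[i].
classes, label_ids = np.unique(raw_labels, return_inverse=True)
n_classes = len(classes)

# One-hot encode the indices (NumPy analogue of tf.one_hot(label_ids, n_classes)).
one_hot = np.eye(n_classes, dtype=np.float32)[label_ids]

# Stand-in for the network output (assumption): a perfect model whose logits
# equal the one-hot rows.
logits = one_hot

# argmax returns class indices, not the original 6-digit codes ...
predicted_ids = np.argmax(logits, axis=1)
# ... so map the indices back to the codes before exporting.
predicted_codes = classes[predicted_ids]

# Write integers as integers. An in-memory buffer is used here; in practice
# pass a file name such as "prediction.csv" instead.
buffer = io.StringIO()
np.savetxt(buffer, predicted_codes.reshape(-1, 1), fmt="%d", delimiter=",")
csv_text = buffer.getvalue()
print(csv_text)
```

In the original TensorFlow code the same idea would mean training on the encoded indices and applying the inverse mapping to the result of tf.argmax(logits, 1) before calling np.savetxt.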