TensorFlow DNN with a tf-idf sparse matrix

Problem description

I am trying to implement a TensorFlow DNN for text classification.

The independent variable (IV) is a tf-idf sparse matrix:

X_train_sam:
<31819x3122 sparse matrix of type '<class 'numpy.float64'>' with 610128 stored elements in Compressed Sparse Row format>

The labels are the dependent variable (DV):

y_train_sam.values:array(['mexican', 'mexican', 'italian', ..., 'chinese', 'italian','italian'], dtype=object)

I convert the sparse matrix to a tensor with the following code:

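import numpy as np
import tensorflow as tf

def convert_sparse_matrix_to_sparse_tensor(X):
    # COO format exposes the row and column index of every stored element
    coo = X.tocoo()
    # One (row, col) pair per stored element
    indices = np.mat([coo.row, coo.col]).transpose()
    return tf.SparseTensorValue(indices, coo.data, coo.shape)

X_train_sam = convert_sparse_matrix_to_sparse_tensor(X_train_sam)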

Preparing the data for modeling:

def train_input_fn(features, labels, batch_size):
    dataset = tf.data.Dataset.from_tensors((features, labels))
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    return dataset.make_one_shot_iterator().get_next()

inp = train_input_fn(X_train_sam,y_train_sam.values,batch_size=1000)

Applying the DNN classifier:

classifier = tf.estimator.DNNClassifier(
    feature_columns=[float]*X_train_sam.dense_shape[1],
    hidden_units=[10, 10],
    n_classes=len(y_train_sam.unique()))

classifier.train(input_fn=lambda:inp)

This raises the following error:

ValueError: features should be a dictionary of `Tensor`s. Given type: <class 'tensorflow.python.framework.sparse_tensor.SparseTensorValue'>

Please give me some pointers; I am new to ML and TensorFlow.

Recommended answer

In this line of your code:

classifier.train(input_fn=lambda:inp)

is lambda:inp supposed to be a dictionary, or do you mean an anonymous function? See the documentation at

https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier

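So input_fn needs to be a function that returns a (features, labels) tuple, not a single value, and, as the error message says, features should be a dictionary of Tensors rather than a SparseTensorValue.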
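As a rough illustration of that contract, here is a minimal sketch of an input_fn whose features are a dictionary. It is not taken from the question or the documentation: the helper names encode_labels and make_input_fn and the feature key "tfidf" are hypothetical, it assumes the tf-idf matrix has been converted to a dense array first, and it swaps the question's Dataset.from_tensors for Dataset.from_tensor_slices so that each row becomes one example.

```python
def encode_labels(labels):
    """Map string labels to integer ids in [0, n_classes)."""
    vocabulary = sorted(set(labels))
    index = {name: i for i, name in enumerate(vocabulary)}
    return [index[name] for name in labels], vocabulary


def make_input_fn(dense_features, label_ids, batch_size, feature_key="tfidf"):
    """Build a zero-argument input_fn to pass to classifier.train.

    feature_key is a hypothetical name chosen for this sketch; it has to
    match the key used by the classifier's feature column.
    """
    def input_fn():
        # Imported here so the dataset is only built when the Estimator
        # actually calls input_fn.
        import tensorflow as tf

        # features is a dict of name -> tensor, paired with the labels;
        # from_tensor_slices yields one (features, label) pair per row.
        dataset = tf.data.Dataset.from_tensor_slices(
            ({feature_key: dense_features}, label_ids))
        return dataset.shuffle(1000).repeat().batch(batch_size)

    return input_fn
```

With the data from the question this might be called as make_input_fn(X.toarray().astype("float32"), ids, 1000), where X is the original SciPy matrix and ids comes from encode_labels(y_train_sam.values). The classifier would then presumably need a matching column such as tf.feature_column.numeric_column("tfidf", shape=(3122,)) in feature_columns instead of [float]*N. Note that densifying a 31819x3122 matrix takes a few hundred megabytes, so this is only a starting point.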
