Scipy sparse CSR matrix to TensorFlow SparseTensor: mini-batch gradient descent

Question

I have a Scipy sparse CSR matrix created from a sparse TF-IDF feature matrix in SVM-Light format. The number of features is huge and the matrix is sparse, so I have to use a SparseTensor; a dense representation is too slow.

For example, if the number of features is 5, a sample file can look like this:

After parsing, the training set looks like this:

trainX = <scipy CSR matrix>
trainY = np.array( [0,1,00] )

I have two important questions:

1) How do I convert this to a SparseTensor (sp_ids, sp_weights) efficiently, so that I can perform fast multiplication (W.X) using lookup: https://www.tensorflow.org/versions/master/api_docs/python/nn.html#embedding_lookup_sparse

2) How do I randomize the dataset at each epoch and recalculate sp_ids and sp_weights, so that I can feed them (feed_dict) for mini-batch gradient descent?

Example code on a simple model such as logistic regression would be much appreciated. The graph would look like this:

# GRAPH
mul = tf.nn.embedding_lookup_sparse(W, X_sp_ids, X_sp_weights, combiner = "sum")  # W.X
z = tf.add(mul, b) #  W.X + b


cost_op = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(z, y_true))  # this already has built in sigmoid apply
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost_op)  # construct optimizer

predict_op = tf.nn.sigmoid(z) # sig(W.X + b)

Recommended answer

I can answer the first part of your question.

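import numpy as np
import tensorflow as tf

def convert_sparse_matrix_to_sparse_tensor(X):
    coo = X.tocoo()  # COO exposes the row, col, and data arrays directly
    # One (row, col) pair per non-zero entry; SparseTensor expects int64 indices.
    indices = np.column_stack((coo.row, coo.col)).astype(np.int64)
    return tf.SparseTensor(indices, coo.data, coo.shape)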

First, convert the matrix to COO format. Then extract the indices, values, and shape, and pass them directly to the SparseTensor constructor.
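The answer above does not cover the second part of the question. As a rough sketch only (not from the original answer), one possible approach is to shuffle the row order once per epoch, slice the CSR matrix by those rows, and build the COO components for each mini-batch. The helper name `iterate_minibatches` and the toy data are hypothetical and chosen for illustration.

```python
import numpy as np
import scipy.sparse as sp


def iterate_minibatches(X, y, batch_size, rng):
    """Yield (indices, values, dense_shape, labels) for one shuffled epoch.

    Hypothetical helper: X is a scipy CSR matrix, y a 1-D label array.
    """
    perm = rng.permutation(X.shape[0])  # fresh row order for this epoch
    for start in range(0, X.shape[0], batch_size):
        rows = perm[start:start + batch_size]
        coo = X[rows].tocoo()  # CSR supports row selection by index array
        # One (row, col) pair per non-zero entry, relative to the batch.
        indices = np.column_stack((coo.row, coo.col)).astype(np.int64)
        yield indices, coo.data, coo.shape, y[rows]


# Toy data: 4 examples, 5 features (values made up for illustration).
trainX = sp.csr_matrix(np.array([
    [0, 0, 0, 1, 0],
    [3, 0, 4, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
], dtype=np.float32))
trainY = np.array([0, 1, 0, 0])

rng = np.random.default_rng(0)
batches = list(iterate_minibatches(trainX, trainY, batch_size=2, rng=rng))
```

Each yielded triple has the same layout that `tf.SparseTensor(indices, values, dense_shape)` takes. With a TensorFlow 1.x style graph, it could presumably be wrapped in a `tf.SparseTensorValue` and fed through `feed_dict` to a `tf.sparse_placeholder`; that wiring is an assumption and is not tested here. Calling the generator again for the next epoch draws a new permutation from the same `rng`.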
