
Problem description

This question is about MLlib (Spark 1.2.1+).

What is the best way to manipulate local matrices (moderate size, under 100x100, so they do not need to be distributed)?

For instance, after computing the SVD of a dataset, I need to perform some matrix operations. RowMatrix only provides a multiply function. The toBreeze method returns a DenseMatrix<Object>, but the API does not seem Java friendly:

public final <TT,B,That> That $plus(B b, UFunc.UImpl2<OpAdd$,TT,B,That> op)

In Spark+Java, how can I do any of the following operations:

  • transpose a matrix
  • add/subtract two matrices
  • crop a matrix
  • perform element-wise operations
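
For matrices this small, the four operations can also be written directly against plain arrays once the values are copied out of Spark. The following is only a minimal sketch under that assumption: LocalMatrixOps and its methods are hypothetical names, not part of Spark, Breeze, or MTJ, and the inputs are assumed to be non-empty rectangular double[][] arrays indexed as [row][column].

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper class for small local matrices stored as double[row][column].
// Assumes non-empty, rectangular inputs; no dimension checks are performed.
final class LocalMatrixOps {
    private LocalMatrixOps() {}

    // Transpose: result[j][i] = a[i][j].
    static double[][] transpose(double[][] a) {
        int rows = a.length, cols = a[0].length;
        double[][] t = new double[cols][rows];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                t[j][i] = a[i][j];
        return t;
    }

    // Element-wise sum a + b (both must have the same shape).
    static double[][] add(double[][] a, double[][] b) {
        double[][] out = new double[a.length][a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                out[i][j] = a[i][j] + b[i][j];
        return out;
    }

    // Element-wise difference a - b (both must have the same shape).
    static double[][] subtract(double[][] a, double[][] b) {
        double[][] out = new double[a.length][a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                out[i][j] = a[i][j] - b[i][j];
        return out;
    }

    // Crop to rows [r0, r1) and columns [c0, c1); end indices are exclusive.
    static double[][] crop(double[][] a, int r0, int r1, int c0, int c1) {
        double[][] out = new double[r1 - r0][];
        for (int i = r0; i < r1; i++)
            out[i - r0] = Arrays.copyOfRange(a[i], c0, c1);
        return out;
    }

    // Apply f to every element and return the result as a new matrix.
    static double[][] map(double[][] a, DoubleUnaryOperator f) {
        double[][] out = new double[a.length][a[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                out[i][j] = f.applyAsDouble(a[i][j]);
        return out;
    }
}
```

With these helpers, something like transpose(U(:,1:k)) would be crop followed by transpose, for example LocalMatrixOps.transpose(LocalMatrixOps.crop(u, 0, u.length, 0, k)) for a local array u.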

Javadoc RowMatrix: https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/distributed/RowMatrix.html

RDD<Vector> data = ...;
RowMatrix matrix = new RowMatrix(data);
SingularValueDecomposition<RowMatrix, Matrix> svd = matrix.computeSVD(15, true, 1e-9d);

RowMatrix U = svd.U();
Vector s = svd.s();
Matrix V = svd.V();
//Example 1: How to compute transpose(U)*matrix
//Example 2: How to compute transpose(U(:,1:k))*matrix

EDIT: Thanks to dlwh for pointing me in the right direction. The following solution works:

import no.uib.cipr.matrix.DenseMatrix;
// ...
RowMatrix U = svd.U();
DenseMatrix U_mtj = new DenseMatrix((int) U.numCols(), (int) U.numRows(), U.toBreeze().toArray$mcD$sp(), true);
// From there, matrix operations are available on U_mtj

Recommended answer

Breeze just doesn't provide a Java-friendly API. (And, speaking as the main author, I have no plans to: it would hamstring the API too much.)

You can probably exploit the fact that MTJ uses the same dense matrix representation as we do. (Well, almost. Their API doesn't expose majorStride, but that shouldn't be an issue for you.)
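
To make the shared representation concrete, here is a minimal sketch of the layout, assuming the column-major storage that both libraries document for their dense matrices and a contiguous array with no extra stride (that is, a major stride equal to the number of rows). ColumnMajor and its methods are hypothetical names used only for illustration; they are not part of Breeze or MTJ.

```java
// Hypothetical illustration of a column-major dense matrix backed by a flat double[].
// Assumes contiguous storage: element (i, j) of a matrix with `rows` rows
// lives at index i + j * rows.
final class ColumnMajor {
    private ColumnMajor() {}

    // Flat index of element (i, j) in a column-major array with `rows` rows.
    static int index(int rows, int i, int j) {
        return i + j * rows;
    }

    // Read element (i, j) from the flat array.
    static double get(double[] data, int rows, int i, int j) {
        return data[index(rows, i, j)];
    }

    // Transpose a rows x cols column-major matrix into a new
    // cols x rows column-major array.
    static double[] transpose(double[] data, int rows, int cols) {
        double[] t = new double[data.length];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                t[index(cols, j, i)] = data[index(rows, i, j)];
        return t;
    }
}
```

Because the data is a single flat array, a library that uses the same layout can wrap it given only the row and column counts, which is what makes handing the array from one library to the other possible.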

That is, you can do something like this:

import no.uib.cipr.matrix.DenseMatrix;

// ...

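// From Java, Breeze's DenseMatrix[Double] is seen as DenseMatrix<Object>
breeze.linalg.DenseMatrix<Object> Ubreeze = U.toBreeze();
// Sketch only: getting the backing double[] from Java may need a different accessor
// (the asker's edit above uses toArray$mcD$sp() and a four-argument constructor)
new DenseMatrix(Ubreeze.cols(), Ubreeze.rows(), Ubreeze.data());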
