Problem description
I want to initialize a matrix using data inside flatMap. This is my data:
-4,0,1.0 ### horrible . not-work install dozen scanner umax ofcourse . tech-support everytime call . fresh install work error . crummy product crummy tech-support crummy experience .
2,1,1.0 ### scanner run . grant product run windows . live fact driver windows lose performance . setup program alert support promptly quits . amazon . website product package requirement listing compatible windows .
1,2,1.0 ### conversion kit spare battery total better stick versionand radio blow nimh charger battery . combination operation size nimh battery . motorola kit . rechargable battery available flashlight camera game toy .
-4,3,1.0 ### recieive part autowinder catch keep place sudden break . hold listen music winder wind . extremely frustrated fix pull little hard snap half . flush drain .
And this is my code:
val spark_context = new SparkContext(conf)
val data = spark_context.textFile(Input)
val Gama=DenseMatrix.zeros[Double](4,2)
var gmmainit = data.flatMap(line => {
  val tuple = line.split("###")
  val ss = tuple(0)
  val re = """^(-?\d+)\s*,\s*(\d+)\s*,\s*(\d+).*$""".r
  val re(n1, n2, n3) = ss // pattern match and extract values
  if (n1.toInt >= 0) {
    Gama(n2.toInt, 0) += 1
  }
  if (n1.toInt < 0) {
    Gama(n2.toInt, 1) += 1
  }
})
println(Gama)
But it doesn't change the Gama matrix.
How can I modify my code to solve this problem?
Recommended answer
First of all, your code won't even compile. If you take a look at the flatMap signature:
flatMap[U](f: T => TraversableOnce[U])
you'll see that it maps from T to TraversableOnce[U]. Since the update method of DenseMatrix returns Unit, the function you use is of type String => Unit, and Unit is not a TraversableOnce.
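To make the type mismatch concrete, here is a minimal sketch that uses a plain Scala List instead of an RDD. That substitution is an assumption made so the snippet runs without Spark, and the names FlatMapShape, lines and tokens are purely illustrative. The point is the same in both cases: the function passed to flatMap has to return a collection for every element.

```scala
object FlatMapShape {
  // Illustrative input, not the original data set.
  val lines: List[String] = List("2,1,1.0", "-4,3,1.0")

  // Compiles: every line is mapped to a collection of fields,
  // and flatMap concatenates those collections.
  val tokens: List[String] = lines.flatMap(line => line.split(",").toList)

  // Would not compile: the body only performs a side effect, so its
  // type is Unit, which is not a collection.
  // val broken = lines.flatMap(line => println(line))

  def main(args: Array[String]): Unit =
    println(tokens)
}
```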
Moreover, as already explained by Justin, each partition gets its own local copy of the variables referenced in a closure, and only that copy is modified. The Gama matrix on the driver is therefore never updated.
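The effect can be imitated without Spark. The sketch below is only an analogy, under the assumption that captured variables reach a task as a serialized copy: it round-trips an array through plain Java serialization and then mutates the copy. ClosureCopy, roundTrip, driverCounts and taskCounts are hypothetical names introduced for this illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object ClosureCopy {
  // Serialize a value and read it back, which yields an independent copy.
  def roundTrip[A <: AnyRef](value: A): A = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val driverCounts = Array(0.0, 0.0)       // stands in for Gama on the driver
    val taskCounts = roundTrip(driverCounts) // stands in for the copy a task works on
    taskCounts(0) += 1                       // only the copy is updated
    println(driverCounts.mkString(","))      // the driver-side array is unchanged
    println(taskCounts.mkString(","))
  }
}
```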
One way you can solve this problem is something like this:
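import breeze.linalg.DenseMatrix

val gmmainit = data.mapPartitions(iter => {
  val re = """^(-?\d+)\s*,\s*(\d+)\s*,\s*(\d+).*$""".r
  // Each partition fills its own local matrix
  val gama = DenseMatrix.zeros[Double](4,2)
  iter.foreach{
    // Column 0 counts non-negative first fields, column 1 negative ones
    case re(n1, n2, n3) => gama(n2.toInt, if(n1.toInt >= 0) 0 else 1) += 1
    case _ => // ignore lines that do not match
  }
  Iterator(gama)
}).reduce(_ + _) // sum the per-partition matrices on the driver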
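The same pattern (build a local table per partition, then merge the tables) can be tried without Spark or Breeze. The following is a rough sketch under stated assumptions: nested arrays stand in for DenseMatrix, groups of two lines stand in for partitions, and PartitionCounts, countPartition and merge are hypothetical names that do not come from the original answer.

```scala
object PartitionCounts {
  private val Record = """^(-?\d+)\s*,\s*(\d+)\s*,\s*(\d+).*$""".r

  // Count the records of one partition into a fresh rows x 2 table:
  // column 0 for non-negative first fields, column 1 for negative ones.
  def countPartition(lines: Iterator[String], rows: Int): Array[Array[Double]] = {
    val counts = Array.fill(rows, 2)(0.0)
    lines.foreach {
      case Record(n1, n2, _) => counts(n2.toInt)(if (n1.toInt >= 0) 0 else 1) += 1
      case _ => // skip lines that do not match
    }
    counts
  }

  // Element-wise sum of two tables, the role played by `_ + _` on matrices.
  def merge(a: Array[Array[Double]], b: Array[Array[Double]]): Array[Array[Double]] =
    a.zip(b).map { case (ra, rb) => ra.zip(rb).map { case (x, y) => x + y } }

  def main(args: Array[String]): Unit = {
    // Shortened stand-ins for the four records in the question.
    val lines = List("-4,0,1.0 ### a", "2,1,1.0 ### b", "1,2,1.0 ### c", "-4,3,1.0 ### d")
    // Two groups of two lines stand in for two partitions.
    val total = lines.grouped(2).map(p => countPartition(p.iterator, 4)).reduce(merge)
    total.foreach(row => println(row.mkString(" ")))
  }
}
```

Because every table is created inside countPartition, nothing is shared between partitions, and the only combined state is the value returned by reduce.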