This post covers how to sum edge weights with GraphX.
Problem description
I have a Graph[Int, Int], where each edge has a weight value. For each user, I want to collect all in-edges and sum the weights associated with them.
Say the data looks like this:
import org.apache.spark.SparkContext
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Assumes an existing SparkContext named sc (for example, the one spark-shell provides)

// Create an RDD for the vertices
val users: RDD[(VertexId, (String, String))] =
  sc.parallelize(Array((3L, ("rxin", "student")),
                       (7L, ("jgonzal", "postdoc")),
                       (5L, ("franklin", "prof")),
                       (2L, ("istoica", "prof"))))
// Create an RDD for the edges; the Int attribute is the edge weight
val relationships: RDD[Edge[Int]] =
  sc.parallelize(Array(Edge(3L, 7L, 12),
                       Edge(5L, 3L, 1),
                       Edge(2L, 5L, 3),
                       Edge(5L, 7L, 5)))
// Define a default user in case a relationship refers to a missing user
val defaultUser = ("John Doe", "Missing")
// Build the initial graph
val graph = Graph(users, relationships, defaultUser)
My ideal outcome is a data frame with vertex IDs and the summed weight value. It is basically a weighted in-degree measure:
id value
3L 1
5L 3
7L 17
2L 0
Recommended answer
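// Send each edge weight to its destination vertex and sum the messages per vertex.
// toDF needs spark.implicits._ in scope (spark-shell imports it automatically).
val temp = graph.aggregateMessages[Int](triplet => triplet.sendToDst(triplet.attr), _ + _, TripletFields.EdgeOnly).toDF("id", "value")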
temp.show()
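One caveat with the answer above: aggregateMessages only returns vertices that received at least one message, so vertex 2, which has no in-edges, would be absent from the result rather than listed with 0. A minimal sketch of one way to fill in the zeros is to join the sums back onto the graph with outerJoinVertices. The object name WeightedInDegree, the helper weightedInDegree, and the local SparkSession settings are assumptions made for illustration and are not part of the original answer.

```scala
import org.apache.spark.graphx._
import org.apache.spark.sql.SparkSession

object WeightedInDegree {
  // Hypothetical helper: sums incoming edge weights per vertex,
  // using 0 for vertices that have no in-edges.
  def weightedInDegree(graph: Graph[(String, String), Int]): VertexRDD[Int] = {
    // Each edge sends its weight to its destination; messages are summed per vertex.
    val sums: VertexRDD[Int] = graph.aggregateMessages[Int](
      ctx => ctx.sendToDst(ctx.attr),
      _ + _,
      TripletFields.EdgeOnly
    )
    // Vertices without any message get None here, which becomes 0.
    graph.outerJoinVertices(sums) { (_, _, sumOpt) => sumOpt.getOrElse(0) }.vertices
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .appName("weighted-in-degree")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._
    val sc = spark.sparkContext

    // Same sample data as in the question.
    val users = sc.parallelize(Seq(
      (3L, ("rxin", "student")),
      (7L, ("jgonzal", "postdoc")),
      (5L, ("franklin", "prof")),
      (2L, ("istoica", "prof"))
    ))
    val relationships = sc.parallelize(Seq(
      Edge(3L, 7L, 12),
      Edge(5L, 3L, 1),
      Edge(2L, 5L, 3),
      Edge(5L, 7L, 5)
    ))
    val graph = Graph(users, relationships, ("John Doe", "Missing"))

    weightedInDegree(graph).toDF("id", "value").orderBy("id").show()
    spark.stop()
  }
}
```

With the sample data this should give 0 for vertex 2, 1 for vertex 3, 3 for vertex 5, and 17 for vertex 7, matching the table in the question.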