How to sum edge weights with GraphX

Problem description

I have a Graph[Int, Int] in which each edge has a weight value. For each user, I want to collect all in-edges and sum the weights associated with them.

Say the data looks like this:

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx._
    import org.apache.spark.rdd.RDD

    // Assumes `sc` is an existing SparkContext (e.g. the one provided by spark-shell)

    // Create an RDD for the vertices
    val users: RDD[(VertexId, (String, String))] =
      sc.parallelize(Array((3L, ("rxin", "student")),
                           (7L, ("jgonzal", "postdoc")),
                           (5L, ("franklin", "prof")),
                           (2L, ("istoica", "prof"))))

    // Create an RDD for the edges; the Int attribute is the edge weight
    val relationships: RDD[Edge[Int]] =
      sc.parallelize(Array(Edge(3L, 7L, 12),
                           Edge(5L, 3L, 1),
                           Edge(2L, 5L, 3),
                           Edge(5L, 7L, 5)))

    // Define a default user in case a relationship refers to a missing user
    val defaultUser = ("John Doe", "Missing")

    // Build the initial graph
    val graph = Graph(users, relationships, defaultUser)

My ideal outcome is a data frame of vertex ids and summed weight values, which is basically a weighted in-degree measure:

id    value
3L    1
5L    3
7L    17
2L    0


Recommended answer

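Use aggregateMessages to send each edge's weight to its destination vertex and sum the messages per vertex:

    // toDF requires `import spark.implicits._` to be in scope
    val temp = graph.aggregateMessages[Int](
      triplet => triplet.sendToDst(triplet.attr), _ + _, TripletFields.EdgeOnly
    ).toDF("id", "value")

    temp.show()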
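
One detail worth checking against the desired output: aggregateMessages only returns vertices that received at least one message, so a vertex with no in-edges (2L in the sample data) does not appear in its result. The sketch below is one possible way to get the 0 row as well, by joining the sums back onto all vertices with outerJoinVertices. The names `WeightedInDegree`, `sampleGraph` and `weightedInDegree` are made up for illustration, and the local SparkSession is an assumption for running outside spark-shell.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Illustrative object; the names here are not part of GraphX.
object WeightedInDegree {

  // Rebuilds the sample graph from the question.
  def sampleGraph(sc: SparkContext): Graph[(String, String), Int] = {
    val users: RDD[(VertexId, (String, String))] =
      sc.parallelize(Seq((3L, ("rxin", "student")),
                         (7L, ("jgonzal", "postdoc")),
                         (5L, ("franklin", "prof")),
                         (2L, ("istoica", "prof"))))
    val relationships: RDD[Edge[Int]] =
      sc.parallelize(Seq(Edge(3L, 7L, 12),
                         Edge(5L, 3L, 1),
                         Edge(2L, 5L, 3),
                         Edge(5L, 7L, 5)))
    Graph(users, relationships, ("John Doe", "Missing"))
  }

  // Sums incoming edge weights per vertex, defaulting to 0 for
  // vertices that have no in-edges.
  def weightedInDegree[VD](graph: Graph[VD, Int]): VertexRDD[Int] = {
    // Only vertices that receive at least one message appear here.
    val inWeights: VertexRDD[Int] =
      graph.aggregateMessages[Int](
        ctx => ctx.sendToDst(ctx.attr), _ + _, TripletFields.EdgeOnly)

    // Join the sums back onto every vertex; a missing sum becomes 0.
    graph.outerJoinVertices(inWeights) {
      (id, oldAttr, sumOpt) => sumOpt.getOrElse(0)
    }.vertices
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("weighted-in-degree")
      .getOrCreate()
    import spark.implicits._

    val result = weightedInDegree(sampleGraph(spark.sparkContext))
      .toDF("id", "value")
    result.orderBy("id").show()

    spark.stop()
  }
}
```

With the sample data this should give the four rows from the question, including the row with value 0 for vertex 2. If the zero rows are not needed, the shorter aggregateMessages call on its own is enough.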
