How to display a KeyValueGroupedDataset in Spark?
Question
I am trying to learn Datasets in Spark. One thing I can't figure out is how to display a KeyValueGroupedDataset, since show doesn't work on it. Also, what is the equivalent of map for a KeyValueGroupedDataset? I would appreciate it if someone could give some examples.
Recommended answer
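OK, I got the idea from the examples given here and here. Below is a simple example that I wrote. A KeyValueGroupedDataset has no show method of its own, so the example first turns it back into a regular Dataset with mapGroups and then calls show on the result.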
scala> val x = Seq(("a", 36), ("b", 33), ("c", 40), ("a", 38), ("c", 39)).toDS
x: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]

scala> val g = x.groupByKey(_._1)
g: org.apache.spark.sql.KeyValueGroupedDataset[String,(String, Int)] = ...

scala> val z = g.mapGroups{case(k, iter) => (k, iter.map(x => x._2).toArray)}
z: org.apache.spark.sql.Dataset[(String, Array[Int])] = [_1: string, _2: array<int>]

scala> z.show
+---+--------+
| _1|      _2|
+---+--------+
|  c|[40, 39]|
|  b|    [33]|
|  a|[36, 38]|
+---+--------+
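
Besides mapGroups, a few other methods on KeyValueGroupedDataset can help with inspecting or transforming the groups. The following is a minimal sketch rather than a definitive recipe. It assumes Spark 2.1 or later (where mapValues is available) and a local session. The object name GroupedDatasetSketch, the method demo, and the app name are made up for illustration. In spark-shell the session already exists as spark, so only the body of demo would be needed.

```scala
import org.apache.spark.sql.{Dataset, KeyValueGroupedDataset, SparkSession}

// Hypothetical object and method names, used only for this illustration.
object GroupedDatasetSketch {

  // Runs a few KeyValueGroupedDataset operations and returns the per-key sums.
  def demo(spark: SparkSession): Map[String, Int] = {
    import spark.implicits._

    val x = Seq(("a", 36), ("b", 33), ("c", 40), ("a", 38), ("c", 39)).toDS()
    val g: KeyValueGroupedDataset[String, (String, Int)] = x.groupByKey(_._1)

    // keys returns an ordinary Dataset of the distinct grouping keys, so show works.
    val keys: Dataset[String] = g.keys
    keys.show()

    // count returns an ordinary Dataset of (key, number of rows in the group).
    val counts: Dataset[(String, Long)] = g.count()
    counts.show()

    // mapValues is the closest analogue of map: it transforms every value
    // but keeps the grouping, so the result is still a KeyValueGroupedDataset.
    val ages: KeyValueGroupedDataset[String, Int] = g.mapValues(_._2)

    // mapGroups produces exactly one output row per group.
    val sums: Dataset[(String, Int)] = ages.mapGroups { (k, vs) => (k, vs.sum) }
    sums.show()

    // flatMapGroups may produce zero or more output rows per group.
    val flat: Dataset[String] = g.flatMapGroups { (k, vs) => vs.map(v => s"$k:${v._2}") }
    flat.show()

    sums.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder().master("local[*]").appName("kvgd-sketch").getOrCreate()
    try println(demo(spark)) finally spark.stop()
  }
}
```

In short, keys and count give a quick view of what the groups look like, mapValues plays the role of map while preserving the grouping, and mapGroups or flatMapGroups convert the groups back into a Dataset that can be shown. The row order printed by show is not guaranteed.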