Question
Can someone explain to me the difference between map and flatMap and what is a good use case for each?
What does "flatten the results" mean, and what is it good for?
Recommended answer
Here is an example of the difference:
val textFile = sc.textFile("README.md") // create an RDD of lines of text
// MAP:
textFile.map(_.length).collect() // map each line to its length, then collect to the driver
res2: Array[Int] = Array(14, 0, 71, 0, 0, ...)
// -> one length per line
// FLATMAP:
textFile.flatMap(_.split(" ")).collect() // split each line into words, then collect
res3: Array[String] = Array(#, Apache, Spark, ...)
// -> multiple words per line, and multiple lines,
//    but we end up with a single output array
map transforms an RDD of length N into another RDD of length N.
For example, it maps from N lines into N line-lengths.
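As a rough illustration of the one-in, one-out behaviour, here is a minimal sketch that uses a plain Scala Seq as a stand-in for an RDD (an assumption made so it can be pasted into any Scala REPL without a SparkContext; the sample lines are made up):

```scala
// Stand-in for the lines of a text file; a local Seq is assumed instead of an RDD.
val lines = Seq("# Apache Spark", "", "Spark is a fast engine")

// map calls the function exactly once per element,
// so the result always has as many entries as the input.
val lengths = lines.map(_.length)

println(lengths)                     // List(14, 0, 22)
println(lengths.size == lines.size)  // true
```

With a real RDD the call looks the same, except that the result stays distributed until an action such as collect() is invoked.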
flatMap (loosely speaking) transforms an RDD of length N into a collection of N collections, then flattens these into a single RDD of results.
For example, flatMapping a collection of lines produces a collection of words:
["a b c", "", "d"] => [["a","b","c"],[],["d"]] => ["a","b","c","d"]
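The two arrows above can be sketched step by step. This again assumes a local Scala Seq in place of an RDD, and `words` is a hypothetical helper introduced only for this illustration:

```scala
// Plain Scala collections are assumed here as a stand-in for an RDD.
val input = Seq("a b c", "", "d")

// Hypothetical helper: split on spaces and drop empty tokens so that "" yields no words.
// (A bare "".split(" ") returns Array(""), not an empty array.)
def words(line: String): List[String] =
  line.split(" ").filter(_.nonEmpty).toList

val nested = input.map(words)      // step 1: one collection of words per line
val flat   = nested.flatten        // step 2: concatenate the inner collections
val direct = input.flatMap(words)  // flatMap performs both steps in one call

println(nested)          // List(List(a, b, c), List(), List(d))
println(flat)            // List(a, b, c, d)
println(flat == direct)  // true
```

Spark's RDD has no flatten method of its own, so on an RDD flatMap is the usual way to get this behaviour.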
The input and output RDDs will therefore typically be of different sizes.
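Because each input element may contribute zero, one, or many outputs, flatMap is a reasonable fit whenever elements need to be dropped or expanded while being transformed. A small sketch on local Scala collections (the sample values are invented for illustration):

```scala
import scala.util.Try

// Hypothetical raw input in which some entries are not valid integers.
val readings = Seq("12", "n/a", "7", "")

// Try(...).toOption yields Some(n) on success and None on failure;
// flatMap treats each Option as a collection of zero or one elements,
// so unparseable entries simply disappear from the output.
val parsed = readings.flatMap(s => Try(s.toInt).toOption)

// Each element can also expand into several outputs.
val expanded = Seq(1, 2, 3).flatMap(n => Seq.fill(n)(n))

println(parsed)    // List(12, 7)
println(expanded)  // List(1, 2, 2, 3, 3, 3)
```

Here four inputs shrink to two outputs in the first case, and three inputs grow to six in the second; map could produce neither result, since it always preserves the element count.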