Can someone explain the difference between map and flatMap, and what is a good use case for each?

Question

Can someone explain to me the difference between map and flatMap, and what is a good use case for each?

What does "flatten the results" mean? What is it good for?

Recommended answer

Here is an example of the difference:

val textFile = sc.textFile("README.md") // create an RDD of lines of text

// MAP:

textFile.map(_.length).collect()  // map over the lines, then collect the results to the driver:

    // res2: Array[Int] = Array(14, 0, 71, 0, 0, ...)

          // -> one length per line

// FLATMAP:

textFile.flatMap(_.split(" ")).collect()   // split each line into words, then collect:

    // res3: Array[String] = Array(#, Apache, Spark, ...)

          // -> multiple words per line, and multiple lines
          // - but we end up with a single output array

map transforms an RDD of length N into another RDD of length N.

For example, it maps N lines to N line lengths.

flatMap (loosely speaking) transforms an RDD of length N into a collection of N collections, then flattens these into a single RDD of results.

For example, flatMapping from a collection of lines to a collection of words:

["a b c", "", "d"] => [["a","b","c"],[],["d"]] => ["a","b","c","d"]
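The two-step picture above can be tried out with plain Scala collections, no Spark required. This is only a sketch: the `MapVsFlatMap` object and the `words` helper are made up for illustration. One caveat it accounts for: on the JVM, `"".split(" ")` returns an array holding a single empty string rather than an empty array, so the helper filters out empty tokens to match the picture.

```scala
object MapVsFlatMap {
  // Hypothetical helper (not from the original answer): split a line on
  // single spaces and drop empty tokens, so an empty line yields no words.
  def words(line: String): List[String] =
    line.split(" ").filter(_.nonEmpty).toList

  def main(args: Array[String]): Unit = {
    val lines = List("a b c", "", "d")

    // map: exactly one output element per input element
    val lengths = lines.map(_.length)
    println(lengths) // List(5, 0, 1)

    // map with a collection-valued function: a collection of collections
    val nested = lines.map(words)
    println(nested) // List(List(a, b, c), List(), List(d))

    // flatMap: the same function, with the results flattened into one list
    val flat = lines.flatMap(words)
    println(flat) // List(a, b, c, d)

    // flatMap gives the same result as map followed by flatten
    println(flat == nested.flatten) // true
  }
}
```

Without the filter, the empty line would contribute one empty-string "word" to the output.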

The input and output RDDs will therefore typically be of different sizes.
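To see the size difference on actual RDDs, here is a minimal sketch that counts elements before and after each transformation. It assumes Spark core is on the classpath and runs in local mode; the `RddSizes` object and its `sizes` method are illustrative names, not part of Spark or of the original answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSizes {
  // Illustrative helper: returns (input count, count after map, count after flatMap).
  def sizes(sc: SparkContext, input: Seq[String]): (Long, Long, Long) = {
    val lines = sc.parallelize(input)

    // map: one length per line, so the count is unchanged
    val lengths = lines.map(_.length)

    // flatMap: zero or more words per line, flattened into one RDD
    // (empty tokens are dropped so an empty line contributes nothing)
    val words = lines.flatMap(line => line.split(" ").filter(_.nonEmpty).toSeq)

    (lines.count(), lengths.count(), words.count())
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local context, just for experimenting
    val conf = new SparkConf().setMaster("local[*]").setAppName("map-vs-flatMap")
    val sc = new SparkContext(conf)
    try {
      println(sizes(sc, Seq("a b c", "", "d"))) // (3,3,4)
    } finally {
      sc.stop()
    }
  }
}
```

Here three input lines give three lengths under map, but four words under flatMap.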
