Problem description
I have a very simple snippet of Spark code which was working on Scala 2.11 and stopped compiling after 2.12.
import spark.implicits._
val ds = Seq("val").toDF("col1")
ds.foreachPartition(part => {
  part.foreach(println)
})
It fails with the error:
Error:(22, 12) value foreach is not a member of Object
part.foreach(println)
The workaround is to help the compiler with an explicit type annotation:
import spark.implicits._
val ds = Seq("val").toDF("col1")
println(ds.getClass)
ds.foreachPartition((part: Iterator[Row]) => {
  part.foreach(println)
})
Does anyone have a good explanation of why the compiler cannot infer part as an Iterator[Row]?
ds is a DataFrame, which is defined as type DataFrame = Dataset[Row].
foreachPartition has two signatures:
def foreachPartition(f: Iterator[T] => Unit): Unit
def foreachPartition(func: ForeachPartitionFunction[T]): Unit
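For context on why these two overloads collide: starting with Scala 2.12, a plain lambda is eligible for SAM (single abstract method) conversion, so `part => ...` can satisfy both the `Iterator[T] => Unit` overload and the Java-style `ForeachPartitionFunction[T]` overload. Overload resolution then cannot pin down the lambda's parameter type and it falls back to `Object`. The sketch below reproduces the shape of the problem without Spark; `PartitionFunction` and `FakeDataset` are illustrative stand-ins, not Spark's actual classes:

```scala
import scala.collection.JavaConverters._

// SAM interface, analogous to Spark's ForeachPartitionFunction[T]
trait PartitionFunction[T] {
  def call(it: java.util.Iterator[T]): Unit
}

// Stand-in for Dataset[T], exposing the same pair of overloads
class FakeDataset[T](data: Seq[T]) {
  def foreachPartition(f: Iterator[T] => Unit): Unit = f(data.iterator)
  def foreachPartition(f: PartitionFunction[T]): Unit = f.call(data.iterator.asJava)
}

object Demo extends App {
  val ds = new FakeDataset(Seq("a", "b"))
  // With Scala 2.12+, the untyped lambda matches both overloads via SAM
  // conversion, so the parameter type cannot be inferred:
  // ds.foreachPartition(part => part.foreach(println))   // does not compile
  // Annotating the parameter selects the Scala overload:
  ds.foreachPartition((part: Iterator[String]) => part.foreach(println))
}
```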
Thanks for your help.
Answer
This is to help anyone facing the issue, with a workaround to get around it.
You can convert the DataFrame to an RDD and then use foreachPartition; your code will then compile and build.
ds.rdd.foreachPartition(part => {
  part.foreach(println)
})
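Besides converting to an RDD, you can also stay on the Dataset API by lifting the lambda into an explicitly typed function value, so that only the Scala overload matches (a sketch, assuming the `spark` session and `ds` from the question are in scope):

```scala
import org.apache.spark.sql.Row

// A function value with an explicit Iterator[Row] => Unit type matches only
// the Scala overload, so there is no ambiguity with the SAM overload.
val printPartition: Iterator[Row] => Unit = part => part.foreach(println)
ds.foreachPartition(printPartition)
```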