Problem description
I have a very simple snippet of Spark code which was working on Scala 2.11 and stopped compiling after 2.12.
import spark.implicits._
val ds = Seq("val").toDF("col1")
ds.foreachPartition(part => {
  part.foreach(println)
})
It fails with the error:
Error:(22, 12) value foreach is not a member of Object
part.foreach(println)
The workaround is to help the compiler with an explicit type annotation:
import spark.implicits._
val ds = Seq("val").toDF("col1")
println(ds.getClass)
ds.foreachPartition((part: Iterator[Row]) => {
  part.foreach(println)
})
Does anyone have a good explanation of why the compiler cannot infer part as an Iterator[Row]?
ds is a DataFrame, which is defined as type DataFrame = Dataset[Row].
foreachPartition has two signatures:
def foreachPartition(f: Iterator[T] => Unit): Unit
def foreachPartition(func: ForeachPartitionFunction[T]): Unit
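For context on why these two overloads collide: starting with Scala 2.12, a plain lambda is eligible for SAM (single abstract method) conversion, so `part => ...` can satisfy both the `Iterator[T] => Unit` overload and the Java-style `ForeachPartitionFunction[T]` overload. Overload resolution then cannot pin down the lambda's parameter type and it falls back to `Object`. The sketch below reproduces the shape of the problem without Spark; `PartitionFunction` and `FakeDataset` are illustrative stand-ins, not Spark's actual classes:

```scala
import scala.collection.JavaConverters._

// SAM interface, analogous to Spark's ForeachPartitionFunction[T]
trait PartitionFunction[T] {
  def call(it: java.util.Iterator[T]): Unit
}

// Stand-in for Dataset[T], exposing the same pair of overloads
class FakeDataset[T](data: Seq[T]) {
  def foreachPartition(f: Iterator[T] => Unit): Unit = f(data.iterator)
  def foreachPartition(f: PartitionFunction[T]): Unit = f.call(data.iterator.asJava)
}

object Demo extends App {
  val ds = new FakeDataset(Seq("a", "b"))
  // With Scala 2.12+, the untyped lambda matches both overloads via SAM
  // conversion, so the parameter type cannot be inferred:
  // ds.foreachPartition(part => part.foreach(println))   // does not compile
  // Annotating the parameter selects the Scala overload:
  ds.foreachPartition((part: Iterator[String]) => part.foreach(println))
}
```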
Thanks for your help.
Answer
This is to help anyone facing the issue, with a workaround to get around it.
You can convert the DataFrame to an RDD and then use foreachPartition; your code will then compile and build.
ds.rdd.foreachPartition(part => {
  part.foreach(println)
})
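Besides converting to an RDD, you can also stay on the Dataset API by lifting the lambda into an explicitly typed function value, so that only the Scala overload matches (a sketch, assuming the `spark` session and `ds` from the question are in scope):

```scala
import org.apache.spark.sql.Row

// A function value with an explicit Iterator[Row] => Unit type matches only
// the Scala overload, so there is no ambiguity with the SAM overload.
val printPartition: Iterator[Row] => Unit = part => part.foreach(println)
ds.foreachPartition(printPartition)
```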