How to pass a list of paths to spark.read.load?
Problem description
I can load multiple files at once by passing multiple paths to the load method, e.g.
spark.read
  .format("com.databricks.spark.avro")
  .load(
    "/data/src/entity1/2018-01-01",
    "/data/src/entity1/2018-01-12",
    "/data/src/entity1/2018-01-14")
I'd like to prepare a list of paths first and pass them to the load method, but I get the following compilation error:
val paths = Seq(
  "/data/src/entity1/2018-01-01",
  "/data/src/entity1/2018-01-12",
  "/data/src/entity1/2018-01-14")
spark.read.format("com.databricks.spark.avro").load(paths)
<console>:29: error: overloaded method value load with alternatives:
  (paths: String*)org.apache.spark.sql.DataFrame <and>
  (path: String)org.apache.spark.sql.DataFrame
 cannot be applied to (List[String])
       spark.read.format("com.databricks.spark.avro").load(paths)
Why does this happen, and how can I pass a list of paths to the load method?
Recommended answer
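You just need to apply the splat operator (: _*) to the paths list. The load method takes a varargs parameter (paths: String*), and Scala does not expand a Seq or List into varargs automatically; the : _* type ascription tells the compiler to pass each element of the sequence as a separate argument: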
spark.read.format("com.databricks.spark.avro").load(paths: _*)
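
The same behaviour can be reproduced without a Spark cluster. The following is only a minimal sketch: FakeReader is a hypothetical stand-in that copies just the two load overloads quoted in the compiler error, and it returns the paths it receives instead of a DataFrame. It uses the Scala 2 syntax shown in the answer.

```scala
// Hypothetical stand-in for Spark's reader; it mirrors only the two
// `load` overloads shown in the compiler error and returns the paths
// it was given rather than a DataFrame.
object FakeReader {
  def load(path: String): Seq[String] = Seq(path)
  def load(paths: String*): Seq[String] = paths.toList
}

object SplatDemo {
  val paths: Seq[String] = Seq(
    "/data/src/entity1/2018-01-01",
    "/data/src/entity1/2018-01-12",
    "/data/src/entity1/2018-01-14")

  // FakeReader.load(paths) does not compile: a Seq[String] matches
  // neither the String overload nor the String* overload.
  // The `: _*` ascription passes the sequence as the varargs argument.
  val loaded: Seq[String] = FakeReader.load(paths: _*)

  def main(args: Array[String]): Unit =
    loaded.foreach(println) // prints each path on its own line
}
```

Because the list is built before the call, it can also be computed at runtime (for example by filtering dates) and then passed with the same : _* ascription.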