Problem Description
I am trying to read an XML file in Spark, but I am running into an error when I compile the project with sbt.
build.sbt
name := "First Spark"
version := "1.0"
organization := "in.goai"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
libraryDependencies += "com.databricks" % "spark-avro_2.10" % "2.0.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.2"
resolvers += Resolver.mavenLocal
SparkMeApp.scala
package in.goai.spark

import scala.xml._
import com.databricks.spark.xml
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkContext, SparkConf}

object SparkMeApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("First Spark")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val fileName = args(0)
    val df = sqlContext.read.format("com.databricks.spark.xml").option("rowTag", "book").load(fileName)
    val selectedData = df.select("title", "price")
    val d = selectedData.show
    println(s"$d")
  }
}
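For reference, the input is assumed to look something like the books sample commonly used with spark-xml, with one <book> element per record to match rowTag = "book" (this example file is an assumption; the actual input is not shown in the question):

<?xml version="1.0"?>
<!-- hypothetical sample input; the real file is not included in the question -->
<catalog>
  <book id="bk101">
    <title>XML Developer's Guide</title>
    <price>44.95</price>
  </book>
  <book id="bk102">
    <title>Midnight Rain</title>
    <price>5.95</price>
  </book>
</catalog>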
When I compile it by running "sbt package", it shows the error below:
[error] /home/hadoop/dev/first/src/main/scala/SparkMeApp.scala:4: object xml is not a member of package com.databricks.spark
[error] import com.databricks.spark.xml
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 9 s, completed Sep 22, 2017 4:11:19 PM
Do I need to add any other JAR files related to XML? Please advise, and please point me to any link with information about the JAR files needed for different file formats.
Answer
Because you're using Scala 2.11 and Spark 2.0, change your dependencies in build.sbt to the following:
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
libraryDependencies += "com.databricks" %% "spark-avro" % "3.2.0"
libraryDependencies += "com.databricks" %% "spark-xml" % "0.4.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.6"
- Change the spark-avro version to 3.2.0: https://github.com/databricks/spark-avro#requirements
- Add "com.databricks" %% "spark-xml" % "0.4.1": https://github.com/databricks/spark-xml#scala-211
- Change the scala-xml version to 1.0.6, the current version for Scala 2.11: http://mvnrepository.com/artifact/org.scala-lang.modules/scala-xml_2.11

The full corrected build.sbt is shown after this list.
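Putting those changes together, the corrected build.sbt would look like this (keeping the same project metadata as in the question):

name := "First Spark"

version := "1.0"

organization := "in.goai"

scalaVersion := "2.11.8"

// Spark 2.0.0 artifacts for Scala 2.11 (%% appends the Scala binary version)
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"

// Databricks connectors built for Scala 2.11
libraryDependencies += "com.databricks" %% "spark-avro" % "3.2.0"
libraryDependencies += "com.databricks" %% "spark-xml" % "0.4.1"

// Only needed if you actually use scala.xml in your code
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.6"

resolvers += Resolver.mavenLocal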
In your code, delete the following import statement:
import com.databricks.spark.xml
Note that your code doesn't actually use the spark-avro or scala-xml libraries. Remove those dependencies from your build.sbt (and the import scala.xml._ statement from your code) if you're not going to use them.
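Putting it all together, a minimal cleaned-up SparkMeApp.scala might look like the sketch below. It keeps your logic but drops the unused imports; note that show() prints the rows itself and returns Unit, so there is nothing useful to capture from it:

package in.goai.spark

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkContext, SparkConf}

object SparkMeApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("First Spark")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Path to the XML file, passed as the first program argument
    val fileName = args(0)

    // spark-xml is selected by its data source name, so no import is needed
    val df = sqlContext.read
      .format("com.databricks.spark.xml")
      .option("rowTag", "book")
      .load(fileName)

    // show() prints the selected columns to stdout
    df.select("title", "price").show()
  }
}

Because sbt package does not bundle dependencies into the jar, you also need to make spark-xml available at run time, for example with --packages (the jar name and input path below are illustrative; they depend on your project name, version, and data):

sbt package
spark-submit --packages com.databricks:spark-xml_2.11:0.4.1 \
  --class in.goai.spark.SparkMeApp \
  target/scala-2.11/first-spark_2.11-1.0.jar books.xml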