This article explains how to resolve the error "object xml is not a member of package com.databricks.spark", which should be a useful reference for anyone hitting the same problem.

Problem Description

I am trying to read an XML file using SBT, but I run into an issue when I compile the project.

build.sbt

name := "First Spark"
version := "1.0"
organization := "in.goai"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
libraryDependencies += "com.databricks" % "spark-avro_2.10" % "2.0.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.2"
resolvers += Resolver.mavenLocal

.scala file

package in.goai.spark

import scala.xml._
import com.databricks.spark.xml
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkContext, SparkConf}

object SparkMeApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("First Spark")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val fileName = args(0)
    val df = sqlContext.read.format("com.databricks.spark.xml").option("rowTag", "book").load(fileName)
    val selectedData = df.select("title", "price")
    val d = selectedData.show
    println(s"$d")

  }
}

When I compile it by running "sbt package", it shows the error below:

[error] /home/hadoop/dev/first/src/main/scala/SparkMeApp.scala:4: object xml is not a member of package com.databricks.spark
[error] import com.databricks.spark.xml
[error]        ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 9 s, completed Sep 22, 2017 4:11:19 PM

Do I need to add any other JAR files related to XML? Please advise, and please point me to any link with information about the JARs needed for different file formats.

Recommended Answer

Because you're using Scala 2.11 and Spark 2.0, change your dependencies in build.sbt to the following:

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
libraryDependencies += "com.databricks" %% "spark-avro" % "3.2.0"
libraryDependencies += "com.databricks" %% "spark-xml" % "0.4.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.6"
  1. Change the spark-avro version to 3.2.0: https://github.com/databricks/spark-avro#requirements
  2. Add "com.databricks" %% "spark-xml" % "0.4.1": https://github.com/databricks/spark-xml#scala-211
  3. Change the scala-xml version to 1.0.6, the current version for Scala 2.11: http://mvnrepository.com/artifact/org.scala-lang.modules/scala-xml_2.11
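The key point behind these changes is the %% operator: it makes sbt append your Scala binary version to the artifact name, so the dependency always matches scalaVersion. The original line "com.databricks" % "spark-avro_2.10" % "2.0.1" pinned a Scala 2.10 artifact even though the build uses Scala 2.11.8. A minimal illustration of the equivalence (you only need one of these two lines, not both):

// With %%, sbt expands the artifact name using scalaVersion (2.11.8 -> _2.11)
libraryDependencies += "com.databricks" %% "spark-xml" % "0.4.1"
// Equivalent explicit form using a single % and the _2.11 suffix spelled out
libraryDependencies += "com.databricks" % "spark-xml_2.11" % "0.4.1"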

In your code, delete the following import statement:

import com.databricks.spark.xml

Note that your code doesn't actually use the spark-avro or scala-xml libraries. If you're not going to use them, remove those dependencies from your build.sbt (and the import scala.xml._ statement from your code).
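Putting it together, here is a minimal sketch of the reader without the unneeded import. It keeps the assumptions from the question: the XML path is passed as the first program argument and each <book> element has title and price fields.

package in.goai.spark

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkContext, SparkConf}

object SparkMeApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("First Spark")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val fileName = args(0)

    // The data source is referenced by its name string, so no
    // com.databricks.spark.xml import is required at compile time
    val df = sqlContext.read
      .format("com.databricks.spark.xml")
      .option("rowTag", "book")
      .load(fileName)

    df.select("title", "price").show()
  }
}

If you run this with spark-submit rather than sbt run, the spark-xml jar still has to be on the classpath at runtime, for example via --packages com.databricks:spark-xml_2.11:0.4.1 or by building an assembly jar.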

This concludes the article on the error "object xml is not a member of package com.databricks.spark". Hopefully the recommended answer above helps.
