本文介绍了当案例类字段保留带有反引号的Java关键字时,spark-submit失败的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我有反引号用于保留关键字.案例类的一个示例如下:
I have backticks used for reserved keyword. One example for the case class is as follows:
case class IPC(
`type`: String,
main: Boolean,
normalized: String,
section:String,
`class`: String,
subClass: String,
group:String,
subGroup: String
)
我声明了sparksession如下:
I have declared the sparksession as follows:
def run(params: SparkApp.Params): Unit ={
val sparkSession = SparkSession.builder.master("local[*]").appName("SparkUsptoParser").getOrCreate()
// val conf = new SparkConf().setAppName("SparkUsptoParser").set("spark.driver.allowMultipleContexts", "true")
val sc = sparkSession.sparkContext
sc.setLogLevel("INFO")
sc.hadoopConfiguration.set("fs.s3a.connection.timeout", "500000")
val (patentParsedRDD, zipPathRDD) = runLocal(sc, params)
logger.info(f"Starting to parse files, appending parquet ${params.outputPath}")
import sparkSession.implicits._
val patentParseDF = patentParsedRDD.toDF().write.mode(SaveMode.Append).parquet(params.outputPath)
logger.info(f"Done parsing and appending parquet")
// save list of processed archive
val logPath = params.outputPath + "/log_%s" format java.time.LocalDate.now.toString
zipPathRDD.coalesce(1).saveAsTextFile(logPath)
logger.info(f"Log file save to $logPath")
}
我正在尝试使用sbt运行jar包.但是,我收到错误消息保留的关键字,不能用作字段名称".
I am trying to run the jar package with sbt. However, I receive the error, "reserved keyword and cannot be used as field name".
使用的命令:
./bin/spark-submit /Users/Projects/uspto-parser/target/scala-2.11/uspto-parser-assembly-0.1.jar
错误:
Exception in thread "main" java.lang.UnsupportedOperationException: `class` is a reserved keyword and cannot be used as field name
- array element class: "usptoparser.IPC"
- field (class: "scala.collection.immutable.List", name: "ipcs")
- root class: "usptoparser.PatentDocument"
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$8.apply(ScalaReflection.scala:627)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$8.apply(ScalaReflection.scala:625)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
版本:
sparkVersion := "2.3.0"
sbt.version = 0.13.8
scalaVersion := "2.11.2"
推荐答案
您可以通过使用不是保留的Java关键字的字段名然后使用'as'重命名来解决该问题:
You can work it around by using a field name that is not a reserved Java keyword and then renaming it using 'as':
scala> case class IPC(name: String, `class`: String)
defined class IPC
scala> val x = Seq(IPC("a", "b"), IPC("d", "e")).toDF
java.lang.UnsupportedOperationException: `class` is a reserved keyword and cannot be used as field name
- root class: "IPC"
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor$1$$anonfun$8.apply(ScalaReflection.scala:627)
...
scala> case class IPC(name: String, clazz: String)
defined class IPC
scala> val x = Seq(IPC("a", "b"), IPC("d", "e")).toDF
x: org.apache.spark.sql.DataFrame = [name: string, clazz: string]
scala> x.select($"clazz".as("class")).show(false)
+-----+
|class|
+-----+
|b |
|e |
+-----+
scala>
这篇关于当案例类字段保留带有反引号的Java关键字时,spark-submit失败的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!