Problem description
I am encountering this error:
java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to [Ljava.lang.Object;
whenever I try to use "contains" to find if a string is inside an array. Is there a more appropriate way of doing this? Or, am I doing something wrong? (I am fairly new to Scala)
Here is the code:
val matches = Set[JSONObject]()
val config = new SparkConf()
val sc = new SparkContext("local", "SparkExample", config)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val ebay = sqlContext.read.json("/Users/thomassquires/Downloads/products.json")
val catalogue = sqlContext.read.json("/Users/thomassquires/Documents/catalogue2.json")
val eins = ebay.map(item => (item.getAs[String]("ID"), Option(item.getAs[Set[Row]]("itemSpecifics"))))
.filter(item => item._2.isDefined)
.map(item => (item._1 , item._2.get.find(x => x.getAs[String]("k") == "EAN")))
.filter(x => x._2.isDefined)
.map(x => (x._1, x._2.get.getAs[String]("v")))
.collect()
def catEins = catalogue.map(r => (r.getAs[String]("_id"), Option(r.getAs[Array[String]]("item_model_number")))).filter(r => r._2.isDefined).map(r => (r._1, r._2.get)).collect()
def matched = for(ein <- eins) yield (ein._1, catEins.filter(z => z._2.contains(ein._2)))
The exception occurs on the last line. I have tried a few different variants.
My data structure is one List[Tuple2[String, String]] and one List[Tuple2[String, Array[String]]]. I need to find the zero or more matches from the second list that contain the string.
Thanks
Recommended answer
Long story short (there is still a part that eludes me here*), you're using the wrong types. getAs is implemented as fieldIndex (String => Int) followed by get (Int => Any) followed by asInstanceOf.
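To make that three-step shape concrete, here is a minimal sketch in plain Scala. FakeRow is a hypothetical stand-in written for illustration, not Spark's actual Row class; it only mirrors the fieldIndex / get / asInstanceOf composition described above. Because the type parameter is erased, the cast inside getAs checks nothing, so a wrong type tends to surface only where the result is later used.

```scala
import scala.util.Try

object GetAsSketch {
  // Hypothetical stand-in for a Row: field names plus untyped values.
  final case class FakeRow(names: Seq[String], values: Seq[Any]) {
    def fieldIndex(name: String): Int = names.indexOf(name) // String => Int
    def get(i: Int): Any = values(i)                        // Int => Any
    // T is erased at runtime, so this cast performs no check by itself.
    def getAs[T](name: String): T = get(fieldIndex(name)).asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    // The array-like column is stored as a Seq (here a List), not an Array.
    val row = FakeRow(Seq("_id", "item_model_number"), Seq("42", List("A1", "B2")))

    // Asking for Seq[String] matches what is stored, so contains works.
    val asSeq = row.getAs[Seq[String]]("item_model_number")
    println(asSeq.contains("A1"))

    // Asking for Array[String] compiles, but using the value as an array
    // fails with a ClassCastException, captured here in a Try.
    val asArray = Try(row.getAs[Array[String]]("item_model_number").contains("A1"))
    println(asArray.isFailure)
  }
}
```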
Since Spark uses neither Arrays nor Sets but WrappedArray to store array column data, calls like getAs[Array[String]] or getAs[Set[Row]] are not valid. If you want specific types, you should use either getAs[Seq[T]] or getSeq[T] (which takes a field index) and convert your data to the desired type with toSet / toArray.
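As a rough sketch of that advice applied to the matching step from the question: the rows below are built by hand with made-up values rather than read from JSON, and the object and method names (SeqColumnSketch, matches) are hypothetical. It assumes spark-sql is on the classpath. Rows built this way carry no schema, so the sketch reads fields by position with getString and getSeq; on rows coming from a DataFrame, the name-based getAs[Seq[String]]("item_model_number") should behave the same way.

```scala
import org.apache.spark.sql.Row

object SeqColumnSketch {
  // For each (id, ean) pair, collect the ids of catalogue rows whose
  // array column (position 1) contains that EAN.
  def matches(catalogue: Seq[Row], eins: Seq[(String, String)]): Seq[(String, Seq[String])] = {
    // Read the array column as a Seq first, then convert it to a Set.
    val catEins: Seq[(String, Set[String])] =
      catalogue.map(r => (r.getString(0), r.getSeq[String](1).toSet))

    eins.map { case (id, ean) =>
      (id, catEins.collect { case (catId, models) if models.contains(ean) => catId })
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up catalogue rows: (_id, item_model_number).
    val catalogue = Seq(
      Row("c1", Seq("5012345678900", "4006381333931")),
      Row("c2", Seq("9780201379624"))
    )
    // Made-up (ID, EAN) pairs standing in for `eins` from the question.
    val eins = Seq("e1" -> "4006381333931", "e2" -> "0000000000000")

    matches(catalogue, eins).foreach(println)
  }
}
```

Converting to a Set suits repeated membership tests; calling toArray instead would give the Array[String] shape the question originally asked for.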
* See