Question
I have an RDD containing a timestamp column named time, of type long:
root
|-- id: string (nullable = true)
|-- value1: string (nullable = true)
|-- value2: string (nullable = true)
|-- time: long (nullable = true)
|-- type: string (nullable = true)
I am trying to group by value1, value2 and time formatted as YYYY-MM-DD. I tried to group by cast(time as Date), but then I got the following error:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.RuntimeException: [1.21] failure: ``DECIMAL'' expected but identifier Date found
Does that mean there is no way to group by a date? I even tried adding another level of casting to turn it into a String:
cast(cast(time as Date) as String)
This returns the same error.
I've read that I could probably use aggregateByKey on the RDD, but I don't understand how to use it with several columns or how to convert that long to a YYYY-MM-DD String. How should I proceed?
Recommended answer
I solved the issue by adding this function:
// Format epoch milliseconds as a yyyy-MM-dd string (uses the JVM's default time zone).
def convert(time: Long): String = {
  val sdf = new java.text.SimpleDateFormat("yyyy-MM-dd")
  sdf.format(new java.util.Date(time))
}
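As an aside, SimpleDateFormat cuts days in the JVM's default time zone, so the same timestamp can land on different dates on differently configured machines. Below is a minimal sketch of a variant with an explicit zone, assuming time holds milliseconds since the Unix epoch and that days should be cut at UTC midnight. The names DayBucket and toDay are made up for illustration and are not part of the original answer.

```scala
import java.time.{Instant, ZoneOffset}

object DayBucket {
  // Assumption: `time` is milliseconds since the Unix epoch.
  // Assumption: days are cut at UTC midnight; pass another ZoneId if that is not what you want.
  def toDay(time: Long): String =
    Instant.ofEpochMilli(time) // interpret the long as an instant
      .atZone(ZoneOffset.UTC)  // fix the zone explicitly
      .toLocalDate             // drop the time-of-day part
      .toString                // LocalDate prints as yyyy-MM-dd
}
```

For example, DayBucket.toDay(0L) gives "1970-01-01" and DayBucket.toDay(86400000L) gives "1970-01-02".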
I then registered the convert function with the sqlContext like this:
sqlContext.registerFunction("convert", convert _)
Then I could finally group by date:
select value1, value2, convert(time) from table group by value1, value2, convert(time)
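To make the composite key concrete, here is a rough sketch of the same grouping on a plain Scala collection, with no Spark involved. It assumes epoch-millisecond timestamps bucketed by UTC day and counts rows per key; the Event case class and the names GroupByDay, day and countByDay are hypothetical. On an RDD, the same (value1, value2, day) tuple could presumably serve as the key for reduceByKey or aggregateByKey.

```scala
import java.time.{Instant, ZoneOffset}

object GroupByDay {
  // Hypothetical record type mirroring three columns of the schema in the question.
  final case class Event(value1: String, value2: String, time: Long)

  // Assumption: `time` is epoch milliseconds, bucketed by UTC day.
  def day(time: Long): String =
    Instant.ofEpochMilli(time).atZone(ZoneOffset.UTC).toLocalDate.toString

  // Count events per (value1, value2, yyyy-MM-dd) key.
  def countByDay(events: Seq[Event]): Map[(String, String, String), Int] =
    events
      .groupBy(e => (e.value1, e.value2, day(e.time)))
      .map { case (key, group) => key -> group.size }
}
```

Two events with the same value1 and value2 whose timestamps fall on the same UTC day end up under one key, while an event one day later gets a key of its own.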