This article describes how to deal with Spark SQL not converting the time zone correctly; hopefully it is a useful reference for anyone hitting the same problem.
Problem description
Using Scala 2.10.4 with Spark 1.5.1 and Spark 1.6:
sqlContext.sql(
"""
|select id,
|to_date(from_utc_timestamp(from_unixtime(at), 'US/Pacific')),
|from_utc_timestamp(from_unixtime(at), 'US/Pacific'),
|from_unixtime(at),
|to_date(from_unixtime(at)),
| at
|from events
| limit 100
""".stripMargin).collect().foreach(println)
spark-submit options: --driver-java-options '-Duser.timezone=US/Pacific'
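For context, that option goes on the spark-submit command line; a full invocation might look roughly like this (the class and jar names here are made up for illustration):

spark-submit --class com.example.EventsJob --driver-java-options '-Duser.timezone=US/Pacific' events-job.jar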
Results:
[56d2a9573bc4b5c38453eae7,2016-02-28,2016-02-27 16:01:27.0,2016-02-28 08:01:27,2016-02-28,1456646487]
[56d2aa1bfd2460183a571762,2016-02-28,2016-02-27 16:04:43.0,2016-02-28 08:04:43,2016-02-28,1456646683]
[56d2aaa9eb63bbb63456d5b5,2016-02-28,2016-02-27 16:07:05.0,2016-02-28 08:07:05,2016-02-28,1456646825]
[56d2aab15a21fa5f4c4f42a7,2016-02-28,2016-02-27 16:07:13.0,2016-02-28 08:07:13,2016-02-28,1456646833]
[56d2aac8aeeee48b74531af0,2016-02-28,2016-02-27 16:07:36.0,2016-02-28 08:07:36,2016-02-28,1456646856]
[56d2ab1d87fd3f4f72567788,2016-02-28,2016-02-27 16:09:01.0,2016-02-28 08:09:01,2016-02-28,1456646941]
The time in US/Pacific should be 2016-02-28 00:01:27 and so on, but somehow it subtracts the 8 hours twice.
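As a sanity check outside Spark, the epoch value from the first result row can be converted with the plain Java time API (a minimal sketch, assuming Java 8's java.time is available; none of this depends on the Spark job above):

import java.time.{Instant, ZoneId, ZoneOffset}

// 1456646487 is the `at` value from the first result row
val instant = Instant.ofEpochSecond(1456646487L)

// 2016-02-28T08:01:27 in UTC
println(instant.atZone(ZoneOffset.UTC).toLocalDateTime)
// 2016-02-28T00:01:27 in US/Pacific -- the value the query was expected to produce
println(instant.atZone(ZoneId.of("US/Pacific")).toLocalDateTime)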
Recommended answer
After reading around for some time, the conclusions are as follows:
- Spark SQL does not support date-time, nor time zones.
- Using a timestamp is the only solution.
- from_unixtime(at) parses the epoch time correctly; it is only the printing of it as a string that changes it because of the time zone. It is safe to assume that from_unixtime converts it correctly (although printing it might show different results).
- from_utc_timestamp will shift (not just convert) the timestamp to that time zone; in this case it subtracts 8 hours from the time, since the offset is (-08:00).
- Printing SQL results messes up the times with respect to the time-zone parameter.
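Following those conclusions, one possible workaround (a sketch only, not part of the original answer; it assumes id is a string column and at holds epoch seconds, as in the results above, and that Java 8's java.time is available) is to select the raw epoch seconds and do the time-zone conversion in driver code rather than through from_utc_timestamp and the printed string:

import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

val pacific   = ZoneId.of("US/Pacific")
val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

// Keep the raw epoch seconds in SQL; convert to US/Pacific only when formatting.
sqlContext.sql("select id, at from events limit 100")
  .collect()
  .foreach { row =>
    val id    = row.getString(0)
    // getAs[Number] tolerates the column being either int or long
    val local = Instant.ofEpochSecond(row.getAs[Number](1).longValue).atZone(pacific)
    println(s"$id ${formatter.format(local)}")
  }

This keeps the shifting in one place, in application code, and out of the printed SQL results, which is where the answer above says the confusion comes from.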
That concludes this article on Spark SQL not converting the time zone correctly; hopefully the recommended answer above helps.