Problem Description
I have my timestamp in UTC and ISO 8601 format, but with Structured Streaming it gets automatically converted to local time. Is there a way to stop this conversion? I would like to keep it in UTC.
I'm reading JSON data from Kafka and then parsing it with the from_json Spark function.
Input:
{"Timestamp":"2015-01-01T00:00:06.222Z"}
Flow:
SparkSession
    .builder()
    .master("local[*]")
    .appName("my-app")
    .getOrCreate()
    .readStream()
    .format("kafka")
    ... //some magic
    .writeStream()
    .format("console")
    .start()
    .awaitTermination();
Schema:
StructType schema = DataTypes.createStructType(new StructField[] {
    DataTypes.createStructField("Timestamp", DataTypes.TimestampType, true),
});
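For context, the elided "//some magic" step usually amounts to casting the Kafka value column to a string and applying from_json with the schema above. A minimal sketch of that step (the broker address, topic name, and column alias are placeholders, not taken from the question):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

Dataset<Row> parsed = spark
    .readStream()
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
    .option("subscribe", "my-topic")                     // placeholder topic
    .load()
    // Kafka delivers the payload as binary; cast to string, then parse with the schema.
    .select(from_json(col("value").cast("string"), schema).alias("data"))
    .select("data.Timestamp");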
Output:
+--------------------+
| Timestamp|
+--------------------+
|2015-01-01 01:00:...|
|2015-01-01 01:00:...|
+--------------------+
As you can see, the hour has incremented by itself.
PS: I tried to experiment with the from_utc_timestamp Spark function, but no luck.
Recommended Answer
For me, it worked to use:
spark.conf.set("spark.sql.session.timeZone", "UTC")
This tells Spark SQL to use UTC as the default time zone for timestamps. Spark stores a TimestampType value internally as an instant (microseconds since the epoch, in UTC) and applies the session time zone only when rendering it; by default that session zone is the JVM's local zone, which is why the hour shifted in the console output above. I used it in Spark SQL, for example:
select *, cast('2017-01-01 10:10:10' as timestamp) from someTable
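In a Java application like the one in the question, the same setting can be applied through SparkSession.conf() before the streaming query starts. A minimal sketch:

SparkSession spark = SparkSession
    .builder()
    .master("local[*]")
    .appName("my-app")
    .getOrCreate();

// Render and parse timestamps in UTC instead of the JVM's default time zone.
spark.conf().set("spark.sql.session.timeZone", "UTC");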
I know it does not work in 2.0.1 but works in Spark 2.2. I also used it in SQLTransformer and it worked.
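A sketch of what that SQLTransformer usage might look like (the output alias ts is an assumption; __THIS__ is SQLTransformer's placeholder for the input dataset):

import org.apache.spark.ml.feature.SQLTransformer;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

SQLTransformer transformer = new SQLTransformer()
    .setStatement("SELECT *, cast('2017-01-01 10:10:10' as timestamp) AS ts FROM __THIS__");

// someTable is the input Dataset<Row>; with the session time zone set to UTC,
// the cast above is interpreted in UTC as well.
Dataset<Row> result = transformer.transform(someTable);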
I'm not sure about streaming, though.