Cassandra script to change dates to Instant type


Problem description

Is it possible to change all data values in Cassandra from 2020-05-18T14:18:45.878Z to 1593402243336 (like the Java Instant type)?

All data in this column is of type text.

I wonder how to write a script that changes dates from, for example, 2020-05-18T14:18:45.878Z to 1593402243336.

Recommended answer

In Cassandra, there is a separate timestamp type for this kind of information. Internally, it stores the data as an 8-byte long value representing time in milliseconds since the epoch. This value is accessed via the driver and can be converted into a type specific to the programming language in use (in Java, for example, an Instant). If you access these values via cqlsh, you will see them printed as 2020-05-18T14:18:45.878Z, but under the hood the value is still a long.
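The relationship between the text form and the long value can be sketched with java.time.Instant. This is a minimal illustration, not part of the original answer; the object name TimestampDemo and both helper names are hypothetical. It also shows that the two values quoted in the question are format examples only: they do not denote the same instant.

```scala
import java.time.Instant

object TimestampDemo {
  // Parse an ISO-8601 UTC string into milliseconds since the Unix epoch.
  def isoToEpochMillis(text: String): Long =
    Instant.parse(text).toEpochMilli

  // Render epoch milliseconds back into the ISO-8601 form that cqlsh displays.
  def epochMillisToIso(millis: Long): String =
    Instant.ofEpochMilli(millis).toString

  def main(args: Array[String]): Unit = {
    // The text value from the question.
    println(isoToEpochMillis("2020-05-18T14:18:45.878Z")) // 1589811525878
    // The long value from the question is a different instant.
    println(epochMillisToIso(1593402243336L)) // 2020-06-29T03:44:03.336Z
  }
}
```

Instant.parse throws a DateTimeParseException on malformed input, so a real migration script would need to decide how to handle rows whose text does not parse.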

To perform such a conversion you need two things:

  1. You need to add another column of type timestamp, because you can't change the type of an existing column.
  2. You need a tool to perform the conversion; which one really depends on your requirements. You can do it, for example, with:

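  • Spark, using the Spark Cassandra Connector:

    // Read the table; "table" and "keyspace" are empty in the original and must be filled in.
    val data = spark.read.format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "", "keyspace" -> ""))
      .load()
      .withColumnRenamed("text_column", "date_column") // text values now target the timestamp column
    // Append the rows back; the connector is expected to convert the ISO-8601 strings on write.
    data.write.format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "", "keyspace" -> ""))
      .mode("append")
      .save()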

  • DSBulk. You can unload the data from your database to disk and then load it back, writing to the timestamp column instead of the text column by providing a custom mapping with the -m option. There is a series of blog posts about DSBulk that provide more information and examples.

09-05 23:25