Problem description
I am experimenting with Apache Spark and Apache Cassandra for data analytics, and I am struggling to insert rows back into Cassandra when the table has a timeuuid field.
I have the following table:
CREATE TABLE leech_seed_report.daily_sessions (
    id timeuuid PRIMARY KEY,
    app int,
    count int,
    date bigint,
    offline boolean,
    vendor text,
    version text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX daily_sessions_app_idx ON leech_seed_report.daily_sessions (app);
CREATE INDEX daily_sessions_date_idx ON leech_seed_report.daily_sessions (date);
CREATE INDEX daily_sessions_offline_idx ON leech_seed_report.daily_sessions (offline);
CREATE INDEX daily_sessions_vendor_idx ON leech_seed_report.daily_sessions (vendor);
CREATE INDEX daily_sessions_version_idx ON leech_seed_report.daily_sessions (version);
I am writing to it with:
rows.saveToCassandra("leech_seed_report", "daily_sessions", SomeColumns("id", "date", "app", "vendor", "version", "offline", "count"))
My rows are tuples of the form:
([timeuuid_will_be_here], BigInt, Int, String, String, Boolean, Int)
I have tried inserting into the same table without the timeuuid field and it all works fine, but I can't for the life of me work out how to create a timeuuid for each row.
Any help would be greatly appreciated. I'm new to Spark, Cassandra and Scala and feel like I'm banging my head against a brick wall.
Thanks, Matt.
Recommended answer
Import com.datastax.driver.core.utils.UUIDs and call UUIDs.timeBased() to generate a timeuuid.
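To see what that call gives you, here is a minimal sketch. It assumes the DataStax Java driver 3.x (the one that provides the com.datastax.driver.core package) is on the classpath; the object name TimeuuidDemo is only for illustration.

```scala
import java.util.UUID
import com.datastax.driver.core.utils.UUIDs

// Illustrative object name, not part of any library.
object TimeuuidDemo {
  def main(args: Array[String]): Unit = {
    // Each call returns a fresh version 1 (time-based) UUID,
    // which is the kind of value a Cassandra timeuuid column stores.
    val first: UUID = UUIDs.timeBased()
    val second: UUID = UUIDs.timeBased()

    println(first.version())  // UUID version number
    println(first != second)  // ids from separate calls differ

    // Recover the embedded creation time as milliseconds since the Unix epoch.
    println(UUIDs.unixTimestamp(first))
  }
}
```

The id carries its own creation time, so it can double as a rough "inserted at" marker if you ever need one.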
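In your case, generate the id for each row before saving (SomeColumns takes column names only, not values). Assuming rows currently holds the tuples without the id:
rows.map { case (date, app, vendor, version, offline, count) =>
  (UUIDs.timeBased(), date, app, vendor, version, offline, count)
}.saveToCassandra("leech_seed_report", "daily_sessions", SomeColumns("id", "date", "app", "vendor", "version", "offline", "count"))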
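For a fuller picture, here is a rough sketch of a complete job that gives every row its own timeuuid before writing. It is only one way to do it and makes several assumptions: the case class DailySession, the helper withId, the connection settings and the sample row are all made up for illustration, and date is treated as a Long because the column is a bigint. It uses a case class instead of a tuple so the connector can match fields to columns by name.

```scala
import java.util.UUID
import com.datastax.driver.core.utils.UUIDs
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical row type; field names match the columns of daily_sessions.
case class DailySession(id: UUID, date: Long, app: Int, vendor: String,
                        version: String, offline: Boolean, count: Int)

object SaveDailySessions {
  // Attach a fresh timeuuid to one aggregated tuple.
  def withId(row: (Long, Int, String, String, Boolean, Int)): DailySession = {
    val (date, app, vendor, version, offline, count) = row
    DailySession(UUIDs.timeBased(), date, app, vendor, version, offline, count)
  }

  def main(args: Array[String]): Unit = {
    // Assumed connection settings; adjust for your cluster.
    val conf = new SparkConf()
      .setAppName("daily-sessions")
      .setMaster("local[*]")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // Made-up sample data standing in for the real aggregation result.
    val rows = sc.parallelize(Seq(
      (1430438400000L, 1, "acme", "1.0.2", false, 42)
    ))

    // withId runs once per row, so every row gets its own id.
    rows.map(withId).saveToCassandra("leech_seed_report", "daily_sessions")

    sc.stop()
  }
}
```

Because the id is created inside the map, it is generated per row rather than once for the whole RDD, which is what a primary key needs.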