This article covers how to handle the "Executing a LOGGED BATCH" warning in Cassandra logs.

Problem description

Our Java application does batch inserts into one of our tables. The table schema looks like this:

CREATE TABLE "My_KeySpace"."my_table" (
    key text,
    column1 varint,
    column2 bigint,
    column3 text,
    column4 boolean,
    value blob,
    PRIMARY KEY (key, column1, column2, column3, column4)
) WITH CLUSTERING ORDER BY ( column1 DESC, column2 DESC, column3 ASC, column4 ASC )
AND COMPACT STORAGE
AND bloom_filter_fp_chance = 0.1
AND comment = ''
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.1
AND speculative_retry = 'NONE'
AND caching = {
    'keys' : 'ALL',
    'rows_per_partition' : 'NONE'
}
AND compression = {
    'chunk_length_in_kb' : 64,
    'class' : 'LZ4Compressor',
    'enabled' : true
}
AND compaction = {
    'class' : 'LeveledCompactionStrategy',
    'sstable_size_in_mb' : 5
};

The schema above has gc_grace_seconds = 0. Because of that, I am getting the following warning:

2019-02-05 01:59:53.087 WARN   [SharedPool-Worker-5 - org.apache.cassandra.cql3.statements.BatchStatement:97] Executing a LOGGED BATCH on table [My_KeySpace.my_table], configured with a gc_grace_seconds of 0. The gc_grace_seconds is used to TTL batchlog entries, so setting gc_grace_seconds too low on tables involved in an atomic batch might cause batchlog entries to expire before being replayed.

I have looked at the Cassandra code, and this warning is there for obvious reasons (it comes from BatchStatement, as the log line above shows).

Is there any solution that does not require changing the batch code in the application? Should I increase gc_grace_seconds?

Recommended answer

In Cassandra, batches are not a way to optimize inserts into the database. They are mostly used for coordinating writes into multiple tables, etc. If you use batches to insert into multiple partitions, you will get even worse performance.
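To check whether an existing batch spans several partitions, the rows can be grouped by partition key before they are sent. The following is only a minimal sketch using plain Java collections: the `Row` class and `groupByPartition` method are hypothetical names invented for illustration, and `key` stands for the partition key column of the table in the question. No driver calls are made here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartitionGrouping {
    /** Hypothetical stand-in for one row of my_table; only the partition key matters here. */
    static final class Row {
        final String key;    // partition key column "key"
        final long column2;  // one clustering column, kept only for illustration
        Row(String key, long column2) {
            this.key = key;
            this.column2 = column2;
        }
    }

    /** Groups rows by partition key, preserving the order in which keys were first seen. */
    static Map<String, List<Row>> groupByPartition(List<Row> rows) {
        Map<String, List<Row>> groups = new LinkedHashMap<>();
        for (Row row : rows) {
            groups.computeIfAbsent(row.key, k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 1), new Row("b", 2), new Row("a", 3));
        Map<String, List<Row>> groups = groupByPartition(rows);
        // More than one group means the original batch touched multiple partitions.
        // Each group could become its own batch, or a plain insert if it holds one row.
        groups.forEach((key, group) ->
                System.out.println(key + " -> " + group.size() + " row(s)"));
    }
}
```

If the map ends up with more than one entry, the original batch was a multi-partition batch of the kind described above.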

You can get better insert throughput from asynchronous command execution (via executeAsync), and/or from batches that contain only inserts targeting the same partition.
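When firing many asynchronous writes, it usually helps to cap how many are in flight at once so the cluster is not flooded. Below is a rough sketch of that idea in plain Java. It assumes a driver whose `executeAsync` returns a `CompletionStage` (as in the 4.x DataStax Java driver; the 3.x driver returns a `ResultSetFuture` instead and would need adapting). The `ThrottledAsyncWriter` class, the `writeAll` method and the `asyncWrite` parameter are hypothetical names, and the demo uses a fake write rather than a real session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class ThrottledAsyncWriter {
    /**
     * Submits every item through asyncWrite while allowing at most maxInFlight
     * unfinished writes at a time. The returned future completes when all writes
     * are done, and completes exceptionally if any of them failed.
     */
    static <T> CompletableFuture<Void> writeAll(List<T> items,
                                                Function<T, CompletionStage<?>> asyncWrite,
                                                int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<?>> pending = new ArrayList<>();
        for (T item : items) {
            // Blocks the submitting thread while maxInFlight writes are outstanding.
            permits.acquireUninterruptibly();
            CompletableFuture<?> future;
            try {
                future = asyncWrite.apply(item).toCompletableFuture();
            } catch (RuntimeException e) {
                permits.release(); // do not leak the permit if submission itself fails
                throw e;
            }
            future.whenComplete((result, error) -> permits.release());
            pending.add(future);
        }
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0]));
    }

    public static void main(String[] args) {
        AtomicInteger written = new AtomicInteger();
        // Fake write standing in for a real call such as session.executeAsync(statement).
        Function<Integer, CompletionStage<?>> fakeWrite =
                i -> CompletableFuture.runAsync(written::incrementAndGet);
        writeAll(List.of(1, 2, 3, 4, 5), fakeWrite, 2).join();
        System.out.println("writes completed: " + written.get());
    }
}
```

In a real application, `asyncWrite` would wrap the session call for a single insert statement, and `maxInFlight` would be tuned against the cluster rather than hard-coded.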
