Problem description
Houston, we have a problem.
Trying to create a new table with cqlsh on an existing Cassandra (v2.1.3) keyspace results in:
ServerError:
<ErrorMessage code=0000 [Server error] message="java.lang.RuntimeException:
java.util.concurrent.ExecutionException:
java.lang.RuntimeException:
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found e8c03790-c952-11e4-a753-5981ea73cd7c; expected e8b14370-c952-11e4-a844-8f10bfb9c386)">
After the first create attempt, trying once more results in:
AlreadyExists: Table 'ks.metrics' already exists
But retrieving the list of existing tables for the keyspace with desc tables; does not report the new table.
The issue seems related to Cassandra-8387, except that there is only one client trying to create the table: cqlsh.
We do have a bunch of Spark jobs that create the keyspaces and tables at startup, potentially in parallel. Could this corrupt the keyspace?
Creating a new keyspace and adding a table to it works as expected.
Any ideas?
Update
Found a workaround: issue a repair on the keyspace and the tables will appear (desc tables) and are also functional.
Recommended answer
Short answer: They have a race condition, which they think they resolved in 1.1.8...
Long answer:
I get that error all the time on one of my clusters. My test machines have really slow hard drives, and creating one or two tables is enough to trigger the error when I have 4 nodes on two separate computers.
Below I have a copy of the stack trace from my Cassandra 3.7 installation. Although your version is 2.1.3, I would be surprised if this part of the code had changed that much.
As we can see, the exception happens in the validateCompatibility() function. It requires the new and old versions of the metadata to agree on the following:
- ksName (keyspace name)
- cfName (column family name)
- cfId (column family UUID)
- flags (isSuper, isCounter, isDense, isCompound)
- comparator (key sort comparator)
If any one of these values does not match between the old and new metadata, the process raises an exception. In our case, the cfId values are different.
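To make the check concrete, here is a minimal sketch of a comparison over the fields listed above. It is not Cassandra's code: the class CompatibilitySketch, the TableMeta holder and the require() helper are hypothetical names, the flags and comparator are reduced to plain strings, and the "found / expected" ordering (incoming value first, current value second) is an assumption modeled on the error text in the question.

```java
import java.util.UUID;

/** Simplified, hypothetical model of the check; not Cassandra's actual CFMetaData. */
final class CompatibilitySketch {

    /** Holds only the identity fields listed above; real table metadata carries much more. */
    static final class TableMeta {
        final String ksName;
        final String cfName;
        final UUID cfId;
        final String flags;      // stand-in for isSuper/isCounter/isDense/isCompound
        final String comparator; // stand-in for the key sort comparator

        TableMeta(String ksName, String cfName, UUID cfId, String flags, String comparator) {
            this.ksName = ksName;
            this.cfName = cfName;
            this.cfId = cfId;
            this.flags = flags;
            this.comparator = comparator;
        }
    }

    /** Throws if 'incoming' disagrees with 'current' on any identity field. */
    static void validateCompatibility(TableMeta current, TableMeta incoming) {
        require(current.ksName.equals(incoming.ksName), "Keyspace", incoming.ksName, current.ksName);
        require(current.cfName.equals(incoming.cfName), "Column family", incoming.cfName, current.cfName);
        require(current.cfId.equals(incoming.cfId), "Column family ID", incoming.cfId, current.cfId);
        require(current.flags.equals(incoming.flags), "Column family flags", incoming.flags, current.flags);
        require(current.comparator.equals(incoming.comparator), "Comparator", incoming.comparator, current.comparator);
    }

    // Assumed message layout: incoming value reported as "found", current value as "expected".
    private static void require(boolean ok, String what, Object found, Object expected) {
        if (!ok) {
            throw new IllegalStateException(
                    String.format("%s mismatch (found %s; expected %s)", what, found, expected));
        }
    }

    public static void main(String[] args) {
        // The two UUIDs are the ones quoted in the question's error message.
        UUID expected = UUID.fromString("e8b14370-c952-11e4-a844-8f10bfb9c386");
        UUID found = UUID.fromString("e8c03790-c952-11e4-a753-5981ea73cd7c");
        TableMeta current = new TableMeta("ks", "metrics", expected, "compound", "UTF8Type");
        TableMeta incoming = new TableMeta("ks", "metrics", found, "compound", "UTF8Type");
        try {
            validateCompatibility(current, incoming);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Two definitions of the same table that differ only in their UUID fail this check even though their names and columns are identical, which is the situation the error message describes.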
Going up the stack, we have apply(), which calls validateCompatibility() immediately.
Next we have updateTable(). Similarly, it calls apply() nearly immediately. First, though, it calls getCFMetaData() to retrieve the current ("old") column family metadata that is going to be compared against the new data.
Next we see updateKeyspace(). That function calculates a diff to know what changed, then saves the changes for each type of data in turn. Tables come second, after types...
Before that comes mergeSchema(), which calculates what changed at the keyspace level. It drops the keyspaces that were deleted and generates new keyspaces for those that were updated (and for new keyspaces). Finally, it loops over the new keyspaces, calling updateKeyspace() for each one of them.
Next in the stack we see an interesting function: mergeSchemaAndAnnounceVersion(). This one updates the schema version once the keyspaces have been updated in memory and on disk. The version of the schema includes the cfId that is not compatible, which is what generates the exception. The Announce part sends a gossip message telling the other nodes that this node now knows of a new version of a certain schema.
Next we see something called MigrationTask. This is the message used to migrate changes between Cassandra nodes. The message payload is a collection of mutations (those handled by the mergeSchema() function).
The rest of the stack just shows run() functions of various kinds that are used to handle messages.
In my case, the problem resolves itself a little later and all is well. I don't have to do anything for the schema to finally get in sync, as expected. However, it prevents me from creating all my tables in one go. So my take, looking at this, is that the migration messages do not arrive in the expected order. There must be a timeout that is handled by resending the event, and that generates the mix-up.
So, let's look at the code that sends the message in the first place; you will find it in the MigrationManager. Here we have a MIGRATION_DELAY_IN_MS parameter linked to an old issue, "Schema push/pull race", which was meant to avoid a race condition. Well... there you go. So they are aware that a race condition is possible, and to try to avoid it they added a little delay there. One part of that fix is a version check: if the versions are already equal, the update is skipped altogether (i.e. that gossip is ignored).
if (Schema.instance.getVersion().equals(currentVersion))
{
    logger.debug("not submitting migration task for {} because our versions match", endpoint);
    return;
}
The delay we are talking about is one minute:
public static final int MIGRATION_DELAY_IN_MS = 60000;
One would think that a whole minute would suffice, but somehow I still get the error all the time.
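As a rough illustration of the "delay, then compare versions" idea described above, here is a small self-contained sketch. It is not the MigrationManager code: DelayedPullSketch, schedulePull() and the supplier/runnable parameters are hypothetical names, and the assumption that the comparison runs when the delay expires (rather than before it) is mine.

```java
import java.util.UUID;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of a delayed schema pull guarded by a version check. */
final class DelayedPullSketch {

    /**
     * Waits delayMs, then pulls the remote schema only if the two versions still differ.
     * The returned future yields true if the pull ran and false if it was skipped.
     */
    static ScheduledFuture<Boolean> schedulePull(ScheduledExecutorService executor,
                                                 Supplier<UUID> localVersion,
                                                 Supplier<UUID> remoteVersion,
                                                 Runnable pull,
                                                 long delayMs) {
        return executor.schedule(() -> {
            // Versions are read when the delay expires, not when the task was scheduled.
            if (localVersion.get().equals(remoteVersion.get())) {
                return false; // versions match: nothing to migrate
            }
            pull.run();
            return true;
        }, delayMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        try {
            UUID local = UUID.randomUUID();
            UUID remote = UUID.randomUUID();
            boolean pulled = schedulePull(executor, () -> local, () -> remote,
                    () -> System.out.println("pulling schema"), 50).get();
            System.out.println("pulled = " + pulled);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A guard like this only helps when the two nodes have converged by the time the delay expires; it does nothing about a message that was already in flight, which fits the observation that the error still shows up.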
The fact is that their code does not expect multiple changes happening one after the other combined with large delays like the ones I have. So if I were to create one table and then do other things, I'd be just fine. On the other hand, when I want to create 20 tables in a row on those slow machines, the gossip message from a previous schema change arrives late (i.e. after the new CREATE TABLE command has reached that node). That's when I get that error. The worst part, I guess, is that it is a spurious error (i.e. it is really telling me that the gossip arrived late, not that my schema is invalid; the schema in the gossip message is simply an old one).
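If that diagnosis is right, one way to sidestep the problem is to issue the CREATE TABLE statements one at a time and wait for the nodes to agree on a schema version before sending the next one. The sketch below only shows the control flow and is untested against a real cluster: SchemaAgreementSketch, awaitAgreement() and createSequentially() are hypothetical names, and the execute and versions hooks stand in for whatever client you use to run a statement and to list the distinct schema versions the nodes report.

```java
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: apply schema changes one by one, waiting for agreement in between. */
final class SchemaAgreementSketch {

    /** Polls until exactly one schema version is reported or the timeout expires. */
    static boolean awaitAgreement(Supplier<Set<UUID>> versions, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (versions.get().size() == 1) {
                return true; // every node reports the same version
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // still diverged when the timeout expired
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    /**
     * Runs each statement, then waits for agreement before moving on.
     * Returns how many statements were executed; stops early if agreement is not reached.
     */
    static int createSequentially(List<String> statements, Consumer<String> execute,
                                  Supplier<Set<UUID>> versions, long timeoutMs, long pollMs) {
        int executed = 0;
        for (String ddl : statements) {
            execute.accept(ddl);
            executed++;
            if (!awaitAgreement(versions, timeoutMs, pollMs)) {
                break;
            }
        }
        return executed;
    }
}
```

A caller would pass the list of DDL statements together with the two hooks, and treat a return value smaller than the list size as "the cluster had not converged, do not continue".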
Stack trace:
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 122a2d20-9e13-11e6-b830-55bace508971; expected 1213bef0-9e
    at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:790) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:750) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.config.Schema.updateTable(Schema.java:661) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1350) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1306) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1256) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) ~[apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53) [apache-cassandra-3.9.jar:3.9]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) [apache-cassandra-3.9.jar:3.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]