Titan node does not come up

Question

I have a small Titan 0.5.0 cluster with 8 nodes. Every node runs Titan in Rexster 2.5.0 and Cassandra, and all nodes are configured identically. Unfortunately, nearly every time one of them fails to start. In most cases this is one of the seed nodes.

Using cassandra as the storage backend, I get the following in the Rexster/Titan log:

WARN  com.tinkerpop.rexster.config.GraphConfigurationContainer - Could
  not open global configuration com.thinkaurelius.titan.core.TitanException:
  Could not open global configuration
 at com.thinkaurelius.titan.diskstorage.Backend.
   getStandaloneGlobalConfiguration(Backend.java: 405)
...
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException:
  Temporary failure in storage backend
 at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.
   AstyanaxStoreManager.ensureColumnFamilyExists(AstyanaxStoreManager.java:446)
...
Caused by: com.netflix.astyanax.connectionpool.exceptions.BadRequestException:
  BadRequestException: [host=192.168.0.10(192.168.0.10):9160, latency=496(496),
  attempts=1] InvalidRequestException(why:Cannot add already existing
  column family "system_properties" to keyspace "titan")
 at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(
   ThriftConverter.java:159)

Rexster fails to start and thus does not load the graph. However, the Cassandra node that Rexster failed to connect to seems to be fine: nodetool lists the node as part of the ring. If I fire requests against the remaining Rexster instances, everything seems to work.

I wiped all data before starting the nodes.

I switched to cassandrathrift, which resulted in a similar exception (the same TitanException, caused by a PermanentBackendException, caused by a TimeoutException). The storage timeout in Rexster is 30s. This may be too low since I currently start all nodes simultaneously, but it does not explain the issues with cassandra.

What is going wrong here?

EDIT

I was misusing Titan. To avoid dealing with index creation on startup, which happens quite often in my case, I created the index in the Rexster extension. I think this code got invoked multiple times: when I started multiple nodes simultaneously, it seems several of them tried to create the index.

Question: Is there any way the extension can create the indices safely? I created a separate thread for this: What are the methods to create indices?

I increased the storage timeout to 60s and retried the procedure after removing the index creation from the code. I still start all nodes simultaneously. Again one Rexstitan node (seed node #2) fails to start.

The Cassandra log does indeed contain an exception

java.lang.IllegalArgumentException: Unknown keyspace/cf pair (titan.txlog)
    at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:166)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:326)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

which I can see on both seed nodes. While the Rexster on one seed node does not seem to care, the other Rexster instance fails to start with

Caused by: com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=192.168.0.10(192.168.0.10):9160, latency=66(66), attempts=1]InvalidRequestException(why:Cannot add already existing column family "graphindex_lock_" to keyspace "titan")
    at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:159)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
    at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:119)
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:338)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.executeSchemaChangeOperation(ThriftClusterImpl.java:146)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.internalCreateColumnFamily(ThriftClusterImpl.java:240)

in rexstitan.log. This sounds quite similar to the exceptions raised before.

Just to clarify: by "fail" I mean that Rexster is started and can be queried, but it failed to load the Titan graph "graph".

Maybe I have to reduce the cluster to a minimal size to check whether this is related to cluster size.

EDIT #2

It is not related to cluster size. And it's getting really annoying. Sometimes it is the BadRequestException above, sometimes it's a BadRequestException because there already is a keyspace "titan". Or it is an IllegalArgumentException:

2646 [main] WARN  com.tinkerpop.rexster.config.GraphConfigurationContainer -
  Database has already been initialized but not frozen
  java.lang.IllegalArgumentException: Database has already been initialized but not frozen
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:93)
    at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1294)
    at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:93)
    at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:73)
    at com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration.configureGraphInstance(TitanGraphConfiguration.java:33)
    at com.tinkerpop.rexster.config.GraphConfigurationContainer.getGraphFromConfiguration(GraphConfigurationContainer.java:124)
    at com.tinkerpop.rexster.config.GraphConfigurationContainer.<init>(GraphConfigurationContainer.java:54)
    at com.tinkerpop.rexster.server.XmlRexsterApplication.reconfigure(XmlRexsterApplication.java:99)
    at com.tinkerpop.rexster.server.XmlRexsterApplication.<init>(XmlRexsterApplication.java:47)
    at com.tinkerpop.rexster.Application.<init>(Application.java:97)
    at com.tinkerpop.rexster.Application.main(Application.java:189)

Is it not possible to start multiple nodes at once? Do they conflict? This is the only reason I can think of, because I can get any of these exceptions and sometimes it works fine.

Recommended answer

The problem is the simultaneous startup of the Titan nodes (version 0.5.0). The more nodes you start at once, the more likely the BadRequestExceptions are, since all the nodes try to create the same keyspace/column families in the Cassandra cluster concurrently.
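To make the race concrete, here is a minimal toy model of it. It assumes each node does a "check, then create" on a shared schema, which is what the `ensureColumnFamilyExists` frame in the stack trace suggests. `ToySchema` and `simulateSimultaneousStartup` are made-up names for illustration, not Titan or Cassandra code, and the barrier deliberately forces the worst case in which every node checks before any node creates.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class StartupRaceDemo {

    /** Toy stand-in for the shared schema; NOT Cassandra or Titan code. */
    static class ToySchema {
        private final Set<String> columnFamilies = ConcurrentHashMap.newKeySet();

        boolean exists(String name) {
            return columnFamilies.contains(name);
        }

        void create(String name) {
            // Set.add returns false when the name is already present.
            if (!columnFamilies.add(name)) {
                throw new IllegalStateException(
                    "Cannot add already existing column family \"" + name + "\"");
            }
        }
    }

    /** Starts the given number of "nodes" at once and returns how many of them fail. */
    static int simulateSimultaneousStartup(int nodes) throws InterruptedException {
        ToySchema schema = new ToySchema();
        CyclicBarrier allChecked = new CyclicBarrier(nodes);
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(nodes);

        for (int i = 0; i < nodes; i++) {
            new Thread(() -> {
                try {
                    // Step 1: every node sees that the column family is missing.
                    boolean missing = !schema.exists("system_properties");
                    // Forced worst case: wait until all nodes have done the check.
                    allChecked.await();
                    // Step 2: every node tries to create it; only one can win.
                    if (missing) {
                        schema.create("system_properties");
                    }
                } catch (IllegalStateException e) {
                    failures.incrementAndGet();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                } finally {
                    done.countDown();
                }
            }).start();
        }
        done.await();
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("failed nodes: " + simulateSimultaneousStartup(8));
    }
}
```

In this forced worst case all nodes but one fail. In a real cluster the timing varies from run to run, which would fit the observation that sometimes one node fails and sometimes everything works.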

To overcome this issue you have to:


  1. start Cassandra (starting all Cassandra nodes at once is fine)
  2. start a single Titan node
  3. open the Rexster console on this node and create the schema and indices
  4. start the remaining Titan nodes
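The steps above can be sketched as a small startup driver. This is only an outline under stated assumptions: the `Node` interface, `fakeNode`, and `startStaged` are hypothetical helpers, not part of Titan or Rexster, and `start()` is assumed to block until the node is actually up. Cassandra is assumed to be running already (step 1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StagedStartup {

    /** Hypothetical handle for one Titan/Rexster node; not a real Titan or Rexster API. */
    interface Node {
        String name();
        void start() throws Exception; // assumed to block until the node is up
    }

    /**
     * Starts the first node alone, runs the schema step, then starts the rest.
     * Returns the actions in the order they were performed.
     */
    static List<String> startStaged(List<Node> nodes, Runnable createSchemaAndIndices)
            throws Exception {
        List<String> log = new ArrayList<>();
        if (nodes.isEmpty()) {
            return log;
        }
        // Step 2: a single Titan node creates the keyspace and column families alone.
        Node first = nodes.get(0);
        first.start();
        log.add("started " + first.name());
        // Step 3: schema and indices are created once, from one place.
        createSchemaAndIndices.run();
        log.add("schema and indices created");
        // Step 4: the remaining nodes start against the existing schema.
        for (Node node : nodes.subList(1, nodes.size())) {
            node.start();
            log.add("started " + node.name());
        }
        return log;
    }

    /** Placeholder node that does nothing; real code would launch Rexster here. */
    static Node fakeNode(String name) {
        return new Node() {
            public String name() { return name; }
            public void start() { /* no-op in this sketch */ }
        };
    }

    public static void main(String[] args) throws Exception {
        List<Node> nodes = Arrays.asList(
            fakeNode("titan-1"), fakeNode("titan-2"), fakeNode("titan-3"));
        startStaged(nodes, () -> { /* create schema and indices here */ })
            .forEach(System.out::println);
    }
}
```

The sketch starts the remaining nodes one after another for simplicity; the steps above only require that they start after the first node and the schema step are done.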


09-05 08:00