LockObtainFailedException updating Lucene search index using Solr

Problem description

I've googled this a lot. Most reports of this problem are caused by a stale lock left behind after a JVM crash. That is not my case.

I have an index with multiple readers and writers. I'm trying to do a mass index update (delete and add, which is how Lucene does updates). I'm using Solr's embedded server (org.apache.solr.client.solrj.embedded.EmbeddedSolrServer). The other writers use the remote, non-streaming server (org.apache.solr.client.solrj.impl.CommonsHttpSolrServer).

I kick off this mass update, it runs fine for a while, and then it dies with a LockObtainFailedException.

I've adjusted my lock timeouts in solrconfig.xml:

<writeLockTimeout>20000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>

I'm about to start reading the Lucene code to figure this out. Any help that saves me from doing that would be great!

All my updates go through the following code (Scala):

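import org.apache.solr.client.solrj.request.{AbstractUpdateRequest, UpdateRequest}

val req = new UpdateRequest
// Commit as part of this request; the two flags are waitFlush and waitSearcher
req.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, false)
// docs is the collection of SolrInputDocument instances to index
req.add(docs)

val rsp = req.process(solrServer)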

solrServer is an instance of org.apache.solr.client.solrj.impl.CommonsHttpSolrServer, org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer, or org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.

Update: I stopped using EmbeddedSolrServer and it works now. I have two separate processes that update the Solr search index:

1) Servlet
2) Command line tool

The command line tool was using EmbeddedSolrServer, and it would eventually crash with the LockObtainFailedException. When I switched to StreamingUpdateSolrServer, the problems went away.

I'm still a little confused that the EmbeddedSolrServer worked at all. Can someone explain this? I thought it would play nicely with the Servlet process and that each would wait while the other was writing.

Recommended answer

I'm assuming that you're doing something like:

writer1.writeSomeStuff();
writer2.writeSomeStuff();  // this one doesn't write

The reason this won't work is that a writer stays open until you close it. writer1 writes and then holds on to the lock even after it has finished writing. (Once a writer gets the lock, it doesn't release it until the writer is closed.) writer2 can't get the lock because writer1 still holds it, so it throws a LockObtainFailedException.
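The locking behaviour described above can be modelled with a few lines of plain Java. This is only a sketch under stated assumptions: ToyWriter is a hypothetical stand-in, not Lucene's IndexWriter, and it fails immediately instead of waiting for writeLockTimeout before throwing. It borrows the write.lock file name from Lucene but makes no claim to match Lucene's actual lock implementation.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteLockDemo {
    /** Hypothetical writer: takes the lock when opened and frees it only in close(). */
    static final class ToyWriter implements AutoCloseable {
        private final Path lockFile;

        ToyWriter(Path indexDir) throws IOException {
            this.lockFile = indexDir.resolve("write.lock");
            try {
                Files.createFile(lockFile); // atomic: fails if the file already exists
            } catch (FileAlreadyExistsException e) {
                throw new IllegalStateException("Lock held by another writer: " + lockFile, e);
            }
        }

        void writeSomeStuff() { /* indexing would happen here */ }

        @Override
        public void close() throws IOException {
            Files.delete(lockFile); // the lock is released only here
        }
    }

    /** Returns true if a second writer could be opened on the same directory. */
    static boolean canOpenSecondWriter(boolean closeFirst) throws IOException {
        Path dir = Files.createTempDirectory("toy-index");
        ToyWriter writer1 = new ToyWriter(dir);
        writer1.writeSomeStuff();
        if (closeFirst) {
            writer1.close();
        }
        try (ToyWriter writer2 = new ToyWriter(dir)) {
            writer2.writeSomeStuff();
            return true;
        } catch (IllegalStateException e) {
            return false; // writer1 finished writing but never released the lock
        } finally {
            if (!closeFirst) {
                writer1.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canOpenSecondWriter(false)); // writer1 left open
        System.out.println(canOpenSecondWriter(true));  // writer1 closed first
    }
}
```

In this model the second writer fails only because the first one is still open, not because it is still writing, which is the situation the question describes.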

If you want to use two writers, you'd need to do something like:

writer1.writeSomeStuff();
writer1.close();
writer2.open();
writer2.writeSomeStuff();
writer2.close();

Since you can only have one writer open at a time, this pretty much negates any benefit you would get from using multiple writers. (It's actually much worse to open and close them all the time, since you'll be constantly paying a warmup penalty.)

So the answer to what I suspect is your underlying question is: don't use multiple writers. Use a single writer with multiple threads accessing it (IndexWriter is thread safe). If you're connecting to Solr via REST or some other HTTP API, a single Solr writer should be able to handle many requests.
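As a rough illustration of the single-writer pattern, the sketch below shares one writer object among several threads. ToyIndexWriter is a hypothetical placeholder for the shared IndexWriter, and the thread and document counts are arbitrary; the point is only that the writer is opened once and every thread calls into the same instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedWriterDemo {
    /** Hypothetical thread-safe writer; stands in for the single shared IndexWriter. */
    static final class ToyIndexWriter {
        private final List<String> docs = new ArrayList<>();

        synchronized void addDocument(String doc) {
            docs.add(doc);
        }

        synchronized int numDocs() {
            return docs.size();
        }
    }

    /** Indexes threads * docsPerThread documents through one writer and returns the count. */
    static int indexConcurrently(int threads, int docsPerThread) throws InterruptedException {
        ToyIndexWriter writer = new ToyIndexWriter(); // opened once, shared by every thread
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    writer.addDocument("doc-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("indexing did not finish in time");
        }
        return writer.numDocs();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(indexConcurrently(4, 1000)); // prints 4000
    }
}
```

No thread ever opens a writer of its own, so there is never a second contender for the write lock.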

I'm not sure what your use case is, but another possible answer is to see Solr's recommendations for managing multiple indices. The ability to hot-swap cores might be of particular interest.

