This article looks at how to handle SOLR replication repeatedly downloading the entire index from the master; the recommended answer below may be a useful reference for anyone hitting the same problem.

Problem Description

I have 2 slaves replicating from a master that has a 17GB index. I synced both slaves to this, AFTER which I set the poll interval to 60 seconds.

One of the slaves tries to download the entire 17GB index even if only a tiny portion of it has changed. The other does not do this - it is able to get the latest index without this brute force sync. The redundant downloading causes me to exceed my disk space quota because it takes more than 60 seconds to download 17GB and solr kicks off a 2nd sync into yet another temporary directory.

Does anyone have any tips on how to debug this?

Recommended Answer

I can only see three possible causes of this:

  1. An optimize is triggered during the time interval, causing all of the underlying segments to be merged. See: Optimize performance
  2. You're running with an excessively high merge factor, causing your index to merge on every commit. See: Merge factor
  3. You're running with compound files. See the config setting <useCompoundFile>false</useCompoundFile>; compound files also cause a segment merge at every commit.
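As a sketch of where the settings from points 2 and 3 live, the relevant solrconfig.xml fragment might look like the following. This is an assumption-laden example: element placement varies by Solr version (older releases put these under <indexDefaults>, newer ones under <indexConfig>), and 10 is only the common default merge factor, not a recommendation.

```xml
<!-- Hypothetical solrconfig.xml fragment; section name varies by Solr version -->
<indexDefaults>
  <!-- A very high value here delays merges, but with frequent commits an
       aggressive merge policy can rewrite large parts of the index -->
  <mergeFactor>10</mergeFactor>
  <!-- Compound files bundle each segment into one file; disabling them
       avoids the per-commit segment rewrite described in point 3 -->
  <useCompoundFile>false</useCompoundFile>
</indexDefaults>
```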

The only way I can think of to debug this is to drive the replication manually with the Solr Replication HTTP API.

Disable the replication and watch how files are updated in the Solr master with the command: http://host:port/solr/replication?command=indexversion

Followed shortly after by:

http://host:port/solr/replication?command=indexversion
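When comparing two indexversion responses by hand, the values to watch are the index version and generation reported by the master. A minimal sketch of parsing them from the handler's XML output follows; the sample payload and its numbers are hypothetical, and the exact field names may differ between Solr versions, so check against your server's actual response.

```python
import xml.etree.ElementTree as ET

# Hypothetical example of an XML response from command=indexversion;
# real field names/values may differ by Solr version.
SAMPLE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
  </lst>
  <long name="indexversion">1305211010000</long>
  <long name="generation">12</long>
</response>"""


def parse_index_version(xml_text):
    """Extract (indexversion, generation) from a ReplicationHandler response."""
    root = ET.fromstring(xml_text)
    # The version fields are direct <long> children of <response>
    fields = {el.get("name"): int(el.text) for el in root.findall("long")}
    return fields["indexversion"], fields["generation"]


version, generation = parse_index_version(SAMPLE)
print(version, generation)
```

If the generation jumps by more than one between polls, or the master's file set changes wholesale, that points at the merge/optimize causes listed above rather than at the slave.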

Hope this helps!

