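I have a working Lucene 4.3.1 cluster, and I am adding an automated hot backup process similar to the one described in Manning's "Lucene in Action" and in several blog posts. However, the book is based on Lucene 2.3, and the API has changed somewhat in 4.3.1. The book says to instantiate the IndexWriter like this: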

IndexDeletionPolicy policy = new KeepOnlyLastCommitDeletionPolicy();
SnapshotDeletionPolicy snapshotter = new SnapshotDeletionPolicy(policy);
IndexWriter writer = new IndexWriter(dir, analyzer, snapshotter,
                                 IndexWriter.MaxFieldLength.UNLIMITED);

and, when doing a backup:

try {
   IndexCommit commit = snapshotter.snapshot();
   Collection<String> fileNames = commit.getFileNames();
   /*<iterate over & copy files from fileNames>*/
} finally {
   snapshotter.release();
}

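However, this appears to have changed at some point in Lucene 4.x. The SnapshotDeletionPolicy is now configured through the IndexWriterConfig and passed in when the IndexWriter is created. This is the code I have so far:
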
public Indexer(Directory indexDir, PrintStream printStream) throws IOException {
    IndexWriterConfig writerConfig = new IndexWriterConfig(Version.LUCENE_43, new Analyzer());
    snapshotter = new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
    writerConfig.setIndexDeletionPolicy(snapshotter);
    indexWriter = new IndexWriter(indexDir, writerConfig);
}

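Also, when starting the backup, you can no longer just call snapshotter.snapshot(). You now have to supply an arbitrary commitIdentifier id and use it again to release the snapshot when you are done.
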
SnapshotDeletionPolicy snapshotter = indexer.getSnapshotter();
String commitIdentifier = generateCommitIdentifier();
try {
    IndexCommit commit = snapshotter.snapshot(commitIdentifier);
    for (String fileName : commit.getFileNames()) {
        backupFile(fileName);
    }
} catch (Exception e) {
    logger.error("Exception", e);
} finally {
    snapshotter.release(commitIdentifier);
    indexer.deleteUnusedFiles();
}
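
The helpers `generateCommitIdentifier()` and `backupFile()` are not shown in the question. As a rough sketch only, they might look something like the following, assuming the index lives in a plain filesystem directory and that the names returned by `commit.getFileNames()` are relative to it. The class name `BackupHelpers` and the extra `indexDir`/`backupDir` parameters are hypothetical and not part of Lucene or of the original code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Hypothetical helpers for illustration; neither method is a Lucene API.
final class BackupHelpers {
    private BackupHelpers() {}

    /** Returns an arbitrary unique id to pass to snapshot(id) and later to release(id). */
    static String generateCommitIdentifier() {
        return "backup-" + UUID.randomUUID();
    }

    /**
     * Copies one index file, identified by its name, from indexDir into backupDir.
     * Assumes fileName is relative to indexDir.
     */
    static Path backupFile(Path indexDir, Path backupDir, String fileName) throws IOException {
        // Create the backup directory (and any missing parents) if needed.
        Files.createDirectories(backupDir);
        Path source = indexDir.resolve(fileName);
        Path target = backupDir.resolve(fileName);
        // Overwrite a stale copy left over from an earlier backup run.
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With helpers like these, the loop above would call something like `BackupHelpers.backupFile(indexPath, backupPath, fileName)` for each name in the snapshot.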

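However, this does not seem to work. Whether or not documents have been indexed, and whether or not I have committed, my call to snapshotter.snapshot(commitIdentifier) always throws an IllegalStateException saying No index commit to snapshot. Looking at the code, the SnapshotDeletionPolicy seems to think there have been no commits, even though I commit to disk every 5 seconds or so. I have verified that documents are continuously being written and committed to the index, but snapshotter always thinks there are zero commits.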

Can anyone tell me what I might be doing wrong? Let me know if I need to post more details.

Best answer

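I posted the same question to the Lucene java-user mailing list and got an answer almost immediately. The problem is that the SnapshotDeletionPolicy instance you use to configure the IndexWriter is not the same instance the IndexWriter ends up using. During construction, the IndexWriter actually clones the SnapshotDeletionPolicy you passed in, so the Indexer constructor above should instead look like this: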

public Indexer(Directory indexDir, PrintStream printStream) throws IOException {
    IndexWriterConfig writerConfig = new IndexWriterConfig(Version.LUCENE_43, new Analyzer());
    writerConfig.setIndexDeletionPolicy(new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy()));
    indexWriter = new IndexWriter(indexDir, writerConfig);
    snapshotter = (SnapshotDeletionPolicy) indexWriter.getConfig().getIndexDeletionPolicy();
}

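Note the last line, where snapshotter is set to the IndexDeletionPolicy obtained from the IndexWriter's own config. That is the key. After that change, the second block of 4.x code from the original question (the snapshot and release loop) works perfectly.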
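
To illustrate why holding on to the original instance fails, here is a toy model in plain Java. It is not Lucene code: `ToyPolicy` and `ToyWriter` are hypothetical stand-ins that only imitate the behaviour described above, where the writer clones the policy it is given and reports commits to the clone.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes imitate the described behaviour and are NOT Lucene classes.
final class ClonePitfallSketch {

    /** Stand-in for a deletion policy that remembers the commits it has been told about. */
    static final class ToyPolicy implements Cloneable {
        private List<String> commits = new ArrayList<>();

        void onCommit(String name) {
            commits.add(name);
        }

        String snapshot() {
            if (commits.isEmpty()) {
                throw new IllegalStateException("No index commit to snapshot");
            }
            return commits.get(commits.size() - 1);
        }

        @Override
        public ToyPolicy clone() {
            try {
                ToyPolicy copy = (ToyPolicy) super.clone();
                copy.commits = new ArrayList<>(commits); // the clone gets its own state
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** Stand-in for a writer that clones the policy passed to it. */
    static final class ToyWriter {
        private final ToyPolicy policy;

        ToyWriter(ToyPolicy configured) {
            this.policy = configured.clone(); // from here on, only the clone sees commits
        }

        void commit(String name) {
            policy.onCommit(name);
        }

        ToyPolicy getPolicy() {
            return policy;
        }
    }

    public static void main(String[] args) {
        ToyPolicy original = new ToyPolicy();
        ToyWriter writer = new ToyWriter(original);
        writer.commit("segments_1");

        // The writer's own policy instance knows about the commit.
        System.out.println(writer.getPolicy().snapshot());

        // The instance we kept never saw it, so snapshot() fails.
        try {
            original.snapshot();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the instance kept by the caller never hears about any commit, which mirrors the symptom in the question; fetching the policy back from the writer avoids it.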

For reference, here's the answer I got from the Apache Lucene mailing list.
