

Solr DIH -- How to handle deleted documents?


I'm playing around with a Solr-powered search for my webapp, and I figured it'd be best to use the DataImportHandler to handle syncing with the app via the database. I like the elegance of just checking the last_updated_date field. Good stuff. However, I don't know how to handle deleting documents with this approach. The way I see it, I've got 2 choices. I could either send an explicit message to Solr from the client when a document is deleted, or I could add a "deleted" flag and leave the object in the database, so that Solr will notice that the document has changed and is now "deleted." I could add a query filter that would disregard results with the deleted flag, but it seems inefficient to include all the deleted documents in the Lucene index. What do other folks do?



  • Use DIH special commands $deleteDocById or $deleteDocByQuery (requires Solr 1.4+)
  • Use the clean parameter of DIH to delete the whole index before importing.
  • Use preImportDeleteQuery to define what's going to be cleaned up before importing. (requires Solr 1.4+)
  • Use database triggers instead of DIH to manage updating the index.
  • If you're using an ORM, use its interception capabilities instead of DIH. For example, you can use Hibernate events to update the index on insert, update, or delete.
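
For the options that bypass DIH (a trigger-driven job, an ORM event hook, or the explicit client message mentioned in the question), the delete itself is just a small request to Solr's update handler. Here is a minimal sketch in Python, assuming a Solr version that accepts JSON updates, a core named `mycore` on `localhost:8983`, and `id` as the unique key. The URL and the helper names `build_delete_request` and `delete_documents` are illustrative, not part of the original answer.

```python
import json
import urllib.request

# Assumed location of the update handler; adjust host, port, and core name.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update"


def build_delete_request(doc_ids, base_url=SOLR_UPDATE_URL, commit=True):
    """Build (without sending) a delete-by-id request for Solr's JSON update handler."""
    # Solr accepts {"delete": ["id1", "id2", ...]} to delete several documents by unique key.
    payload = json.dumps({"delete": [str(doc_id) for doc_id in doc_ids]}).encode("utf-8")
    # commit=true makes the deletion visible to searches immediately.
    url = base_url + ("?commit=true" if commit else "")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_documents(doc_ids):
    """Send the delete request and return the HTTP status code."""
    with urllib.request.urlopen(build_delete_request(doc_ids), timeout=10) as resp:
        return resp.status
```

Calling `delete_documents([42])` from the code path that removes the row keeps the index free of deleted documents without needing a "deleted" flag. The trade-off is that the application, rather than DIH, is then responsible for retrying if Solr is unreachable.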
