Changing Elasticsearch synonyms dynamically

Question

Is it possible to store the synonyms for Elasticsearch in the index? Or is it possible to get the synonym list from a database like CouchDB? I'd like to add synonyms to Elasticsearch dynamically via the REST API.

Recommended answer

There are two approaches when working with synonyms:

  • expanding them at indexing time,
  • expanding them at query time.
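
To make the two options concrete, here is a minimal sketch of how each could look in an index-creation body. It assumes a recent Elasticsearch version with the `text` field type, and uses the `synonym` token filter together with the `analyzer` and `search_analyzer` mapping parameters. The field name `title`, the analyzer and filter names, and the synonym rules are all made up for illustration; the script only builds and prints the JSON body and does not contact a cluster.

```python
import json

# Hypothetical synonym rules in the comma-separated (Solr) format.
SYNONYMS = ["tv, television", "laptop, notebook"]


def build_index_body(expand_at: str) -> dict:
    """Return an index-creation body that expands synonyms at 'index' or 'query' time."""
    if expand_at not in ("index", "query"):
        raise ValueError("expand_at must be 'index' or 'query'")

    analysis = {
        "filter": {
            # 'my_synonyms' is an illustrative name, not a built-in filter.
            "my_synonyms": {"type": "synonym", "synonyms": SYNONYMS},
        },
        "analyzer": {
            "plain": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase"],
            },
            "with_synonyms": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "my_synonyms"],
            },
        },
    }

    index_time = expand_at == "index"
    field = {
        "type": "text",
        # 'analyzer' runs when documents are indexed,
        # 'search_analyzer' runs when query text is analyzed.
        "analyzer": "with_synonyms" if index_time else "plain",
        "search_analyzer": "plain" if index_time else "with_synonyms",
    }
    return {
        "settings": {"analysis": analysis},
        "mappings": {"properties": {"title": field}},
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body("index"), indent=2))
```

The only difference between the two variants is which side of the mapping gets the synonym-aware analyzer, which is why the choice affects whether existing documents have to be re-indexed when the list changes.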

Expanding synonyms at query time is not recommended, since it raises issues with:

  • scoring, since synonyms have different document frequencies,
  • multi-token synonyms, since the query parser splits on whitespace.
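
The multi-token problem can be illustrated with a toy model. This is plain Python, not Elasticsearch or Lucene code, and the rule `sea biscuit -> seabiscuit` is a made-up example. It only shows why a rule spanning two tokens cannot fire when each whitespace-separated chunk is analyzed on its own.

```python
# Toy model only: this is not Elasticsearch or Lucene code.
# Hypothetical multi-token rule: the sequence "sea biscuit" also means "seabiscuit".
SYNONYMS = {("sea", "biscuit"): ["seabiscuit"]}


def expand(tokens: list[str]) -> list[str]:
    """Append synonyms for every rule whose token sequence occurs in `tokens`."""
    out = list(tokens)
    for rule, replacements in SYNONYMS.items():
        n = len(rule)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == rule:
                out.extend(replacements)
    return out


def analyze_whole(text: str) -> list[str]:
    """Index-time style: the analyzer sees the full text, so the rule can match."""
    return expand(text.lower().split())


def analyze_per_word(text: str) -> list[str]:
    """Query-parser style: split on whitespace first, then analyze each chunk alone."""
    out = []
    for chunk in text.lower().split():
        out.extend(expand([chunk]))
    return out


if __name__ == "__main__":
    print(analyze_whole("Sea Biscuit"))     # ['sea', 'biscuit', 'seabiscuit']
    print(analyze_per_word("Sea Biscuit"))  # ['sea', 'biscuit']
```

In the second call the synonym filter never sees "sea" and "biscuit" side by side, so the two-token rule is silently skipped.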

More details on this at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory (this is the Solr wiki, but it is relevant for Elasticsearch too).

So the recommended approach is to expand synonyms at indexing time. In your case, if the synonym list is managed dynamically, you should re-index every document that contains a term whose synonym list has been updated, so that scoring remains consistent between documents analyzed before and after the update. I'm not saying it is impossible, but it requires some work and will probably cause performance issues for synonyms that occur frequently in your index.
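
One way to find the documents that need re-indexing is to diff the old and new synonym lists and search for the affected terms. The sketch below is only an illustration under several assumptions: rules use the simple comma-separated format (no `=>` rules), each term belongs to at most one group, and the field name `title` is hypothetical. It builds a standard `bool`/`should` search body but does not send it anywhere.

```python
def parse_rules(lines):
    """Map each term to the frozenset of its synonym group ('a, b, c' format)."""
    groups = {}
    for line in lines:
        terms = frozenset(t.strip().lower() for t in line.split(",") if t.strip())
        for term in terms:
            groups[term] = terms
    return groups


def changed_terms(old_lines, new_lines):
    """Return the sorted terms whose synonym group differs between the two lists."""
    old, new = parse_rules(old_lines), parse_rules(new_lines)
    return sorted(t for t in old.keys() | new.keys() if old.get(t) != new.get(t))


def reindex_query(terms, field="title"):
    """Build a search body matching documents that contain any of the given terms."""
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: term}} for term in terms],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    old = ["tv, television", "laptop, notebook"]
    new = ["tv, television, telly", "laptop, notebook"]
    terms = changed_terms(old, new)
    print(terms)  # ['television', 'telly', 'tv']
    print(reindex_query(terms))
```

Every term in a modified group is reported, not just the newly added one, because documents containing "tv" also need the new "telly" expansion. The more frequent those terms are in the index, the larger the re-indexing job becomes.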
