Problem description
I am trying to figure out which is better for incremental search: edge n-grams or the completion suggester. What I have read on the Internet so far suggests that for the completion suggester the data is processed when the index is created, whereas for edge n-grams it is processed at query time, which would make edge n-grams slower than the completion suggester. However, I have just read in the book Elasticsearch: The Definitive Guide that for edge n-grams the processing is also done at indexing time. Now I am really confused. Can anyone please clarify how edge n-grams work internally?
Thanks
Recommended answer
Both act at index time, building dedicated data structures:
- The edge n-gram tokenizer generates tokens: "hello world" becomes "h", "he", "hel"... "worl", "world". The usual "text" (formerly "string") mapping type is used.
- The completion suggester generates a graph: see https://www.elastic.co/blog/you-complete-me . For this, there is a special mapping type, "completion".
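To make the index-time token generation concrete, here is a minimal Python sketch that imitates what an edge n-gram analysis step produces. It is not Elasticsearch code: the function name `edge_ngrams` is hypothetical, the `min_gram` and `max_gram` parameter names are borrowed from the Elasticsearch settings of the same name, and lowercasing plus whitespace splitting is an assumed stand-in for the real analyzer.

```python
def edge_ngrams(text, min_gram=1, max_gram=20):
    """Return the prefixes (edge n-grams) of every word in `text`.

    Assumption: words are lowercased and split on whitespace, which is a
    simplification of what a real Elasticsearch analyzer does.
    """
    tokens = []
    for word in text.lower().split():
        longest = min(len(word), max_gram)
        for size in range(min_gram, longest + 1):
            tokens.append(word[:size])
    return tokens


print(edge_ngrams("hello world"))
# ['h', 'he', 'hel', 'hell', 'hello', 'w', 'wo', 'wor', 'worl', 'world']
```

Every one of these prefixes is stored in the index, which is why the expensive part happens at write time and not at query time.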
At search time, the suggester is less expensive:
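- Edge n-gram tokenizer:
  - The typed text must be analyzed and the resulting terms searched: a search for "Hello Wor" should be analyzed as "hello" + "wor", and then these two terms are searched.
  - However, the n-gram tokenizer should be removed from the search-time analysis (use different analyzers for search and for indexing): otherwise a search for "Henry" would be analyzed as "h", "he", "hen", "henr"... and would return "hello", because they share the prefix "he".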
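The two points above can be illustrated with a toy inverted index in plain Python. This is only a sketch under stated assumptions: the helper names (`analyze`, `build_index`, `search`) and the two sample documents are hypothetical, and a query matches a document if any of its terms matches, which mirrors the default OR behaviour of a match query.

```python
from collections import defaultdict


def analyze(text):
    # Assumed stand-in for the "custom analysis" step: lowercase + split.
    return text.lower().split()


def edge_ngrams(word, min_gram=1, max_gram=20):
    return [word[:n] for n in range(min_gram, min(len(word), max_gram) + 1)]


def build_index(docs):
    """Index time: store every edge n-gram of every word."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in analyze(text):
            for gram in edge_ngrams(word):
                index[gram].add(doc_id)
    return index


def search(index, query, ngram_at_search=False):
    """Search time: look up each query term; a doc matches if any term hits."""
    terms = analyze(query)
    if ngram_at_search:
        # The mistake described above: n-gramming the query as well.
        terms = [gram for term in terms for gram in edge_ngrams(term)]
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


index = build_index({1: "hello world", 2: "henry ford"})

print(search(index, "Hello Wor"))                    # {1}
print(search(index, "Henry"))                        # {2}
print(search(index, "Henry", ngram_at_search=True))  # {1, 2}
```

With the n-gram step left out of the search side, "Henry" only matches the document that really contains it. With the n-gram step applied to the query, the shared prefixes "h" and "he" also pull in "hello world".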
In both cases, you can use a custom analysis chain (French, German, Soundex...):
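- Edge n-gram:
  - Write time: custom analysis + edge ngram + type "text"
  - Read time: custom analysis + truncate + search API
- Completion suggester:
  - Write time: custom analysis + type "completion"
  - Read time: custom analysis + suggest API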
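As a rough illustration of these write-time and read-time chains, the following Python dictionary sketches what an index definition could look like. The index, field, filter and analyzer names (`title`, `title_suggest`, `autocomplete_filter`, and so on) are hypothetical, the gram sizes are arbitrary, and the typeless `mappings` layout assumes Elasticsearch 7 or later. The `truncate` filter length is set equal to `max_gram` on the assumption that its role is to cut long queries down to the longest prefix that was indexed.

```python
import json

MAX_GRAM = 20  # arbitrary choice for this sketch

index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Write time only: expand each token into its prefixes.
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": MAX_GRAM,
                },
                # Read time only: cut long query terms to MAX_GRAM characters.
                "autocomplete_truncate": {
                    "type": "truncate",
                    "length": MAX_GRAM,
                },
            },
            "analyzer": {
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"],
                },
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_truncate"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            # Edge n-gram approach: plain "text" field, two analyzers.
            "title": {
                "type": "text",
                "analyzer": "autocomplete_index",
                "search_analyzer": "autocomplete_search",
            },
            # Completion suggester approach: dedicated field type.
            "title_suggest": {"type": "completion"},
        }
    },
}

print(json.dumps(index_body, indent=2))
```

The resulting JSON could be sent as the body of an index-creation request. The `title` field would then be queried through the search API, and `title_suggest` through the suggest API.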