Problem description
I am using Elasticsearch's built-in simple analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/analysis-simple-analyzer.html), which uses the lower case tokenizer. The text apple 8 IS Awesome is tokenized as follows:
"apple",
"is",
"awesome"
As you can see, the number 8 is not emitted as a token, so if I search for just 8, my message does not appear in the results.
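To illustrate why 8 disappears, here is a rough Python approximation of what a letters-only, lowercasing tokenizer does. It is not Elasticsearch's implementation, and the function name simple_like_tokens is made up for this sketch; it only assumes that tokens are maximal runs of letters and that everything else acts as a separator.

```python
import re


def simple_like_tokens(text):
    """Approximate a lowercasing, letters-only tokenizer (illustrative, not ES code)."""
    # [^\W\d_]+ matches runs of letters: word characters minus digits and underscore.
    # Digits, spaces, and punctuation act as separators and are dropped entirely.
    return re.findall(r"[^\W\d_]+", text.lower())


tokens = simple_like_tokens("apple 8 IS Awesome")
print(tokens)  # ['apple', 'is', 'awesome']
```

Because the digit is treated as a separator rather than as token content, no token is ever produced for 8.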
I went through all the analyzers available in ES but couldn't find one that meets my requirement.
How can I tokenize all the words, including numbers, using a custom or built-in ES analyzer?
Recommended answer
Your question is about the simple analyzer, but you link to a very old version of the documentation. Try https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-simple-analyzer.html
As Val told you, you are probably looking for the standard analyzer. If you want to see the difference, try the analyze API:
- http://localhost:9200/_analyze?analyzer=simple&text=apple%208%20IS%20Awesome
- http://localhost:9200/_analyze?analyzer=standard&text=apple%208%20IS%20Awesome
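The same comparison can be scripted. The sketch below assumes a node at http://localhost:9200 and sends the analyzer and text in a JSON body, which is the form newer Elasticsearch releases expect as far as I know (older ones also accept the query-string form shown above). The helper names build_analyze_request and fetch_tokens are invented for this example.

```python
import json
import urllib.request


def build_analyze_request(analyzer, text, base_url="http://localhost:9200"):
    """Build a POST request for the _analyze endpoint (base_url is an assumption)."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_tokens(analyzer, text, base_url="http://localhost:9200"):
    """Send the request and return only the token strings from the response."""
    request = build_analyze_request(analyzer, text, base_url)
    with urllib.request.urlopen(request) as response:
        return [entry["token"] for entry in json.load(response)["tokens"]]


# Usage against a running node (not executed here):
#   for name in ("simple", "standard"):
#       print(name, fetch_tokens(name, "apple 8 IS Awesome"))
```

With the standard analyzer the output should include 8 as its own token, while the simple analyzer should omit it.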