Problem description
When defining a custom analyzer for Azure Search, there is an option of defining a token filter from this list. I am trying to support both prefix and infix search. For example, if a field contains the name 123 456, I want the searchable terms to contain:
1
12
123
23
3
4
45
456
56
6
The EdgeNGramTokenFilterV2 seems to do the trick. It has a "side" property, but only "front" and "back" are supported, not both. The "front" (default) value generates this list:
1
12
123
4
45
456
and "back" generates:
123
23
3
456
56
6
I tried using two token filters (two EdgeNGramTokenFilterV2s), but chaining them also creates terms that come from combining the two filters, such as "2" or "5":
1
12
123
23
3
4
45
456
56
6
2 // Unwanted
5 // Unwanted
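The unwanted terms appear because token filters in an analyzer run as a chain: the second filter receives the output of the first, not the original tokens. The following is a minimal Python sketch of that behavior. It is a simplified model, not the actual Lucene or Azure Search implementation; the helper names `edge_ngrams` and `apply_filter` are hypothetical, and a minimum gram length of 1 is assumed.

```python
def edge_ngrams(token, side="front", min_gram=1, max_gram=15):
    """Return the edge n-grams of one token, anchored at the front or the back."""
    upper = min(max_gram, len(token))
    if side == "front":
        return [token[:n] for n in range(min_gram, upper + 1)]
    return [token[-n:] for n in range(min_gram, upper + 1)]


def apply_filter(tokens, side):
    """Apply one edge n-gram filter to every token in the stream."""
    return [gram for token in tokens for gram in edge_ngrams(token, side)]


# Assumed tokenizer output for the field value "123 456".
tokens = ["123", "456"]

front = apply_filter(tokens, "front")   # prefixes of each token
back = apply_filter(tokens, "back")     # suffixes of each token

# Chaining: the "back" filter sees the prefixes produced by the "front" filter.
chained = apply_filter(front, "back")

wanted = set(front) | set(back)
unwanted = sorted(set(chained) - wanted)
print(unwanted)
```

In this model the prefix "12" yields the suffix "2" and the prefix "45" yields the suffix "5", which matches the two unwanted terms listed above.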
I also tried using a "reverse" token filter, but this reverses everything and the results are still wrong.
I am using only one search field ("Name") and would prefer to keep it that way. (I thought of using a separate field named "name_reverse" with a different analyzer, but this is very inefficient and will cause a lot of headaches when connecting the search engine to the data source.)
For easier reference, this is the current index creation request:
{
  "name": "testindexboth",
  "fields": [
    {"name": "id", "type": "Edm.String", "key": true},
    {"name": "Name", "type": "Edm.String", "searchable": true, "analyzer": "myAnalyzer"}
  ],
  "analyzers": [
    {
      "name": "myAnalyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "standard_v2",
      "tokenFilters": ["front_filter", "back_filter"]
    }
  ],
  "tokenFilters": [
    {
      "name": "front_filter",
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "maxGram": 15,
      "side": "front"
    },
    {
      "name": "back_filter",
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "maxGram": 15,
      "side": "back"
    }
  ]
}
Is there a way to combine both without scrambling the results?
Recommended answer
Add two fields to your index, each with its own custom analyzer: one for prefixes, one for suffixes. When querying, search against both fields.
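As a rough illustration, the two-field layout could look like the sketch below, which builds the index definition and a search request body as Python dictionaries and prints the index definition as JSON. The field names `NamePrefix` and `NameSuffix`, the analyzer names `prefixAnalyzer` and `suffixAnalyzer`, and the helper functions are hypothetical choices for this example; both fields would need to be populated with the same name value at indexing time.

```python
import json


def edge_ngram_filter(name, side, max_gram=15):
    """Build an EdgeNGramTokenFilterV2 definition for one side."""
    return {
        "name": name,
        "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
        "maxGram": max_gram,
        "side": side,
    }


def custom_analyzer(name, token_filter):
    """Build a custom analyzer that applies exactly one token filter."""
    return {
        "name": name,
        "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
        "tokenizer": "standard_v2",
        "tokenFilters": [token_filter],
    }


def name_field(name, analyzer):
    """Build a searchable string field bound to the given analyzer."""
    return {"name": name, "type": "Edm.String", "searchable": True, "analyzer": analyzer}


index_definition = {
    "name": "testindexboth",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        name_field("NamePrefix", "prefixAnalyzer"),  # hypothetical field name
        name_field("NameSuffix", "suffixAnalyzer"),  # hypothetical field name
    ],
    "analyzers": [
        custom_analyzer("prefixAnalyzer", "front_filter"),
        custom_analyzer("suffixAnalyzer", "back_filter"),
    ],
    "tokenFilters": [
        edge_ngram_filter("front_filter", "front"),
        edge_ngram_filter("back_filter", "back"),
    ],
}

# Search request body that restricts the query to the two n-gram fields.
search_request = {"search": "23", "searchFields": "NamePrefix,NameSuffix"}

print(json.dumps(index_definition, indent=2))
```

Because each analyzer applies only one edge n-gram filter, neither field ever sees the other filter's output, so terms such as "2" or "5" are not produced.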