Problem description
Here is my field in Elasticsearch:
"keywordName": {
"type": "text",
"analyzer": "custom_stop"
}
Here is my analyzer:
"custom_stop": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"my_stop",
"my_snow",
"asciifolding"
]
}
Here are my filters:
"my_stop": {
"type": "stop",
"stopwords": "_french_"
},
"my_snow" : {
"type" : "snowball",
"language" : "French"
}
Here are the documents in my index (in my only field, keywordName):
"canne a peche", "canne", "canne a peche telescopique", "iphone 8", "iphone 8 case", "iphone 8 cover", "iphone 8 charger", "iphone 8 new"
When I search for "canne", it gives me the "canne" document, which is what I want:
GET ads/_search
{
"query": {
"match": {
"keywordName": {
"query": "canne",
"operator": "and"
}
}
},
"size": 1
}
When I search for "canne à pêche", it gives me "canne a peche", which is also fine. The same goes for "Cannes à Pêche" -> "canne a peche" -> OK.
Here is the tricky part: when I search for "iphone 8", it gives me "iphone 8 cover" instead of "iphone 8". If I set the size to 5 (the query returns 5 results containing "iphone 8"), I see that "iphone 8" is only the 4th result in terms of score. The first is "iphone 8 cover", then "iphone 8 case", then "iphone 8 new", and only then "iphone 8".
Here is the response:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 1.4009607,
"hits": [
{
"_index": "ads",
"_type": "keyword",
"_id": "iphone 8 cover",
"_score": 1.4009607,
"_source": {
"keywordName": "iphone 8 cover"
}
},
{
"_index": "ads",
"_type": "keyword",
"_id": "iphone 8 case",
"_score": 1.4009607,
"_source": {
"keywordName": "iphone 8 case"
}
},
{
"_index": "ads",
"_type": "keyword",
"_id": "iphone 8 new",
"_score": 0.70293105,
"_source": {
"keywordName": "iphone 8 new"
}
},
{
"_index": "ads",
"_type": "keyword",
"_id": "iphone 8",
"_score": 0.5804671,
"_source": {
"keywordName": "iphone 8"
}
},
{
"_index": "ads",
"_type": "keyword",
"_id": "iphone 8 charge",
"_score": 0.46705723,
"_source": {
"keywordName": "iphone 8 charge"
}
}
]
}
}
How can I keep the flexibility for a keyword like "canne a peche" (accents, capital letters, plurals) but also tell Elasticsearch that if there is an exact match ("iphone 8" = "iphone 8"), it should give me that exact keywordName?
Recommended answer
Add a keyword sub-field to the mapping, like this:
"keywordName": {
"type": "text",
"analyzer": "custom_stop",
"fields": {
"raw": {
"type": "keyword"
}
}
}
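For context, here is a sketch of how the pieces from the question and this mapping could fit together in a single index-creation body (sent as the body of PUT ads). The index name "ads" and the mapping type "keyword" are taken from the search response in the question; the overall layout assumes an Elasticsearch version that still uses mapping types (5.x/6.x), which is an assumption on my part.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stop": { "type": "stop", "stopwords": "_french_" },
        "my_snow": { "type": "snowball", "language": "French" }
      },
      "analyzer": {
        "custom_stop": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["my_stop", "my_snow", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "keyword": {
      "properties": {
        "keywordName": {
          "type": "text",
          "analyzer": "custom_stop",
          "fields": {
            "raw": { "type": "keyword" }
          }
        }
      }
    }
  }
}
```

The analyzed field keywordName keeps the flexible matching (stop words, stemming, accent folding), while keywordName.raw stores the value verbatim for exact comparison. If the sub-field is added to an existing index, documents indexed earlier will not have it populated until they are reindexed.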
Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"keywordName": {
"query": "iphone 8",
"operator": "and"
}
}
},
{
"term": {
"keywordName.raw": {
"value": "iphone 8"
}
}
}
]
}
},
"size": 10
}
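The match clause still finds every document containing both terms, and the term clause on keywordName.raw only adds to the score of the document whose value is exactly "iphone 8". If the extra score is not enough to push the exact match to the top in your data, one possible variant (a sketch, not part of the original answer) is to make the match clause mandatory and boost the exact-match clause. The boost value 10 below is an arbitrary assumption to tune.

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "keywordName": {
              "query": "iphone 8",
              "operator": "and"
            }
          }
        }
      ],
      "should": [
        {
          "term": {
            "keywordName.raw": {
              "value": "iphone 8",
              "boost": 10
            }
          }
        }
      ]
    }
  },
  "size": 10
}
```

Note that the term query is not analyzed, so the bonus applies only when the search string equals the stored value character for character: "Iphone 8" or "canne à pêche" would still match through the analyzed field but would get no exact-match bonus. A normalizer on the keyword sub-field is one option to relax that, if your Elasticsearch version supports it.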