Problem description
I'm using Django Haystack backed by Elasticsearch for autocomplete, and I'm having trouble searching for digits in a field.
For example, I have a field called 'name' on an object type, with values like these:
['NAME', 'NAME2', 'NAME7', 'ANOTHER NAME 8', '7342', 'SOMETHING ELSE', 'LAST ONE 7']
and I'd like to use autocomplete to find every object with the digit '7' in its name.
I've set up my search_index with this field:
name_auto = indexes.EdgeNgramField(model_attr='name')
and I'm using a search query like this:
SearchQuerySet().autocomplete(name_auto='7')
However, this search returns no results. I believe this is because Haystack's default edgengram_analyzer uses the Elasticsearch "lowercase" tokenizer, which splits on every non-letter character and so throws out digits entirely.
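To illustrate that suspicion, here is a rough plain-Python approximation of the default analysis chain. It is not Elasticsearch itself: the helper names lowercase_tokenize and edge_ngrams are made up for this sketch, and the regex only mimics the "lowercase" tokenizer for ASCII input. The min_gram/max_gram values are those of the haystack_edgengram filter in the default settings.

```python
import re


def lowercase_tokenize(text):
    """Approximate the "lowercase" tokenizer for ASCII text:
    split on anything that is not a letter, then lowercase each token."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the front edge n-grams of a token, like the haystack_edgengram filter."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


names = ['NAME', 'NAME2', 'NAME7', 'ANOTHER NAME 8', '7342',
         'SOMETHING ELSE', 'LAST ONE 7']

# Terms that would end up in the index for each name under this approximation.
indexed = {
    name: [gram for token in lowercase_tokenize(name) for gram in edge_ngrams(token)]
    for name in names
}

for name, grams in indexed.items():
    print(repr(name), grams)
```

Under this approximation no indexed term contains a digit, and '7342' produces no terms at all, which would explain the empty result for '7'.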
So I found elasticstack, which allows customizing the haystack/elasticsearch backend, but I can't seem to configure ELASTICSEARCH_INDEX_SETTINGS correctly to get the behaviour I want.
The default settings look like this:
ELASTICSEARCH_INDEX_SETTINGS = {
    'settings': {
        "analysis": {
            "analyzer": {
                "synonym_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["synonym"]
                },
                "ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_ngram", "synonym"]
                },
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_edgengram"]
                }
            },
            "tokenizer": {
                "haystack_ngram_tokenizer": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 15,
                },
                "haystack_edgengram_tokenizer": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15,
                    "side": "front"
                }
            },
            "filter": {
                "haystack_ngram": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 15
                },
                "haystack_edgengram": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15
                },
                "synonym": {
                    "type": "synonym",
                    "ignore_case": "true",
                    "synonyms_path": "synonyms.txt"
                }
            }
        }
    }
}
I've tried altering the edgengram_analyzer block in a number of ways without success, and adding something like this
"token_chars": [ "letter", "digit" ]
to the "haystack_ngram_tokenizer" has not worked either.
Can someone help me work out how to search for digits with haystack/elasticsearch autocomplete? Or will I have to split the 'name' field into all possible n-grams myself and then use a standard match search? Any help would be greatly appreciated.
Thanks very much!
Recommended answer
This solution helped me:
http://silentsokolov.github.io/2014/09/03/django-haystack-elasticsearch-prombiemy-avtodopolnieniia.html
The article is written in Russian, so use Google Translate.
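For readers who cannot follow the link, here is a minimal sketch of one common way to keep digits. It is an assumption on my part and not necessarily what the linked article does. In the default settings from the question, the analyzers reference only the "standard" and "lowercase" tokenizers, so adding token_chars to "haystack_ngram_tokenizer" has no effect because no analyzer uses it. The sketch instead swaps the "lowercase" tokenizer for "standard" (which keeps digits), adds the built-in "lowercase" token filter to preserve case-insensitivity, and lowers min_gram to 1 so a single character such as '7' still yields a term. The helper name keep_digits and the trimmed DEFAULT_INDEX_SETTINGS are made up for illustration.

```python
import copy

# Trimmed copy of the defaults from the question (only the parts changed here).
DEFAULT_INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_edgengram"],
                },
            },
            "filter": {
                "haystack_edgengram": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15,
                },
            },
        }
    }
}


def keep_digits(index_settings):
    """Return a patched copy whose edge n-gram analyzer no longer drops digits."""
    patched = copy.deepcopy(index_settings)
    analysis = patched["settings"]["analysis"]

    analyzer = analysis["analyzer"]["edgengram_analyzer"]
    # "standard" splits on word boundaries but keeps digits in tokens.
    analyzer["tokenizer"] = "standard"
    # Lowercasing now happens in a token filter instead of the tokenizer.
    analyzer["filter"] = ["lowercase"] + analyzer["filter"]

    # Allow one-character terms so a query like '7' is not analyzed to nothing.
    analysis["filter"]["haystack_edgengram"]["min_gram"] = 1
    return patched


ELASTICSEARCH_INDEX_SETTINGS = keep_digits(DEFAULT_INDEX_SETTINGS)
```

This assumes the backend in use actually reads ELASTICSEARCH_INDEX_SETTINGS, as the question says elasticstack does, and the index has to be rebuilt (for example with rebuild_index) before changed analysis settings apply. Note also that edge n-grams only match from the start of a token: with these settings '7' should match '7342' and 'LAST ONE 7', but not 'NAME7', where the 7 sits at the end of a single token. Matching digits in the middle or at the end of a token would need plain n-grams, for example indexes.NgramField with a similarly adjusted ngram_analyzer.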