Using Django Haystack autocomplete with Elasticsearch to search for digits/numbers?

Question

I'm using Django Haystack backed by Elasticsearch for autocomplete, and I'm having trouble searching for digits in a field.

For example, I have a field called 'name' on an object type that has some values like this:

['NAME', 'NAME2', 'NAME7', 'ANOTHER NAME 8', '7342', 'SOMETHING ELSE', 'LAST ONE 7']

and I'd like to use autocomplete to search for all objects with the number '7' in the name.

I've set up my search_index with this field:

name_auto = indexes.EdgeNgramField(model_attr='name')

and I'm using a search query like so:

SearchQuerySet().autocomplete(name_auto='7')

However, this search returns no results. I believe this is because the default edge n-gram analyzer uses the "lowercase" tokenizer, which splits text on non-letter characters and so throws out digits entirely.
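
To see why digits can vanish, here is a rough pure-Python approximation of that analysis chain. It is not Elasticsearch itself: it assumes a letters-only tokenizer with lowercasing, followed by edge n-grams using the min_gram of 2 and max_gram of 15 from the haystack_edgengram filter. The function names are made up for illustration.

```python
import re

def lowercase_tokenize(text):
    """Approximate the 'lowercase' tokenizer: keep runs of letters, lowercased."""
    return [tok.lower() for tok in re.findall(r"[^\W\d_]+", text)]

def edge_ngrams(token, min_gram=2, max_gram=15):
    """Prefixes of the token that are min_gram..max_gram characters long."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def analyze(text):
    """Tokenize, then expand every token into its edge n-grams."""
    return [gram for tok in lowercase_tokenize(text) for gram in edge_ngrams(tok)]

names = ['NAME', 'NAME2', 'NAME7', 'ANOTHER NAME 8', '7342',
         'SOMETHING ELSE', 'LAST ONE 7']
indexed = {name: analyze(name) for name in names}

print(indexed['NAME7'])  # ['na', 'nam', 'name']
print(indexed['7342'])   # []
```

Under these assumptions no indexed gram contains a digit, so a query for '7' has nothing to match.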

So, I found elasticstack, which allows customizing the haystack/elasticsearch backend, but I can't seem to configure the ELASTICSEARCH_INDEX_SETTINGS correctly to get the functionality I want.

The default settings look like this:

ELASTICSEARCH_INDEX_SETTINGS = {
    'settings': {
        "analysis": {
            "analyzer": {
                "synonym_analyzer" : {
                    "type": "custom",
                    "tokenizer" : "standard",
                    "filter" : ["synonym"]
                },
                "ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_ngram", "synonym"]
                },
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_edgengram"]
                }
            },
            "tokenizer": {
                "haystack_ngram_tokenizer": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 15,
                },
                "haystack_edgengram_tokenizer": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15,
                    "side": "front"
                }
            },
            "filter": {
                "haystack_ngram": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 15
                },
                "haystack_edgengram": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 15
                },
                "synonym" : {
                    "type" : "synonym",
                    "ignore_case": "true",
                    "synonyms_path" : "synonyms.txt"
                }
            }
        }
    }
}

I've tried to alter the edgengram_analyzer block in a number of ways without success, and adding something like this

"token_chars": [ "letter", "digit" ]

to the "haystack_ngram_tokenizer" has not worked either.

Can someone help me determine how to use haystack/elasticsearch/autocomplete to search for digits? Or will I have to split the 'name' field into all possible n-grams myself and then use a standard matching search? Any help would be greatly appreciated.

Thanks very much!

Recommended answer

There is a solution that helped me: http://silentsokolov.github.io/2014/09/03/django-haystack-elasticsearch-prombiemy-avtodopolnieniia.html

The article is written in Russian, so use Google Translate.
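
The linked article is not reproduced here. As an illustration only, the sketch below shows one kind of change that should keep digits: pointing the n-gram analyzers at Elasticsearch's built-in "standard" tokenizer and moving lowercasing into the built-in "lowercase" token filter. In the default settings the analyzers reference the "lowercase" tokenizer rather than haystack_ngram_tokenizer, which may be why adding token_chars to that tokenizer had no effect. The helper keep_digits and the trimmed DEFAULTS dict are hypothetical, and lowering min_gram to 1 is an assumption made so that a one-character query like '7' can match.

```python
import copy

def keep_digits(index_settings):
    """Return a copy of the settings whose n-gram analyzers keep digits."""
    patched = copy.deepcopy(index_settings)
    analysis = patched["settings"]["analysis"]
    for name in ("ngram_analyzer", "edgengram_analyzer"):
        analyzer = analysis["analyzer"][name]
        # "standard" keeps digits inside tokens; lowercasing moves to a filter.
        analyzer["tokenizer"] = "standard"
        analyzer["filter"] = ["lowercase"] + analyzer["filter"]
    # Assumption: allow single-character grams so a query like '7' can match.
    analysis["filter"]["haystack_edgengram"]["min_gram"] = 1
    return patched

# Trimmed stand-in for the default settings shown in the question.
DEFAULTS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_ngram", "synonym"],
                },
                "edgengram_analyzer": {
                    "type": "custom",
                    "tokenizer": "lowercase",
                    "filter": ["haystack_edgengram"],
                },
            },
            "filter": {
                "haystack_ngram": {"type": "nGram", "min_gram": 3, "max_gram": 15},
                "haystack_edgengram": {"type": "edgeNGram", "min_gram": 2, "max_gram": 15},
                "synonym": {
                    "type": "synonym",
                    "ignore_case": "true",
                    "synonyms_path": "synonyms.txt",
                },
            },
        }
    }
}

ELASTICSEARCH_INDEX_SETTINGS = keep_digits(DEFAULTS)
```

Two caveats apply. Edge n-grams only cover the start of each token, so '7342' and 'LAST ONE 7' could match a query for '7' but 'NAME7' would not; matching a digit in the middle of a token would need plain n-grams (an NgramField) instead. Changed analysis settings also only take effect once the index is recreated, for example with Haystack's rebuild_index command.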
