Question
Is there a way to restrict the size of the edge ngrams in django haystack indexing? For example, I create the ngram as follows:
#search_indexes.py
content_auto = indexes.EdgeNgramField(model_attr='name')
But I don't want to create 2-letter ngrams; I want to set the minimum length to 4 or 5.
As background, I am using django-haystack with Elasticsearch, hosted via Bonsai on Heroku.
Recommended answer
What you need to do is override the search mapping in Haystack's ElasticSearch backend.
In brief: extend the ElasticSearch backend and replace its schema mapping, either directly in the subclass or by importing a new mapping from settings.py.
# Unused in this version; needed only if you load the mapping from settings.py.
from django.conf import settings
from haystack.backends.elasticsearch_backend import (
    ElasticsearchSearchBackend,
    ElasticsearchSearchEngine,
)


class MyElasticBackend(ElasticsearchSearchBackend):

    def __init__(self, connection_alias, **connection_options):
        # super() must name this class, not an undefined one.
        super(MyElasticBackend, self).__init__(
            connection_alias, **connection_options)
        MY_SETTINGS = {
            'settings': {
                "analysis": {
                    "analyzer": {
                        "ngram_analyzer": {
                            "type": "custom",
                            "tokenizer": "lowercase",
                            "filter": ["haystack_ngram"]
                        },
                        "edgengram_analyzer": {
                            "type": "custom",
                            "tokenizer": "lowercase",
                            "filter": ["haystack_edgengram"]
                        }
                    },
                    "tokenizer": {
                        "haystack_ngram_tokenizer": {
                            "type": "nGram",
                            "min_gram": 3,
                            "max_gram": 15,
                        },
                        "haystack_edgengram_tokenizer": {
                            "type": "edgeNGram",
                            "min_gram": 2,
                            "max_gram": 15,
                            "side": "front"
                        }
                    },
                    "filter": {
                        "haystack_ngram": {
                            "type": "nGram",
                            "min_gram": 3,
                            "max_gram": 15
                        },
                        # edgengram_analyzer applies this filter, so min_gram
                        # here sets the shortest edge ngram that is indexed.
                        "haystack_edgengram": {
                            "type": "edgeNGram",
                            "min_gram": 5,
                            "max_gram": 15
                        }
                    }
                }
            }
        }
        # Override the backend's default index settings with the custom ones.
        setattr(self, 'DEFAULT_SETTINGS', MY_SETTINGS)


class ConfigurableElasticSearchEngine(ElasticsearchSearchEngine):
    backend = MyElasticBackend
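For the custom engine to take effect, Haystack has to be pointed at it in settings.py. A minimal sketch follows, assuming the classes above live in a module importable as `myapp.search_backends`. That module path is hypothetical, and the URL and index name are placeholders to replace with your own (for example, the URL your Bonsai add-on provides).

```python
# settings.py (sketch; the module path, URL, and index name are assumptions)
HAYSTACK_CONNECTIONS = {
    'default': {
        # Dotted path to the custom engine class defined above.
        'ENGINE': 'myapp.search_backends.ConfigurableElasticSearchEngine',
        # Placeholder Elasticsearch endpoint.
        'URL': 'http://127.0.0.1:9200/',
        # Placeholder index name.
        'INDEX_NAME': 'haystack',
    },
}
```

Analysis settings are normally applied when the index is created, so an existing index will probably need to be rebuilt (for example with `python manage.py rebuild_index`) before the new minimum length shows up.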
For a fuller explanation, see my write-up about extending the ElasticSearch backend to customize the search mapping.
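To make the effect of `min_gram` concrete, here is a small pure-Python sketch of front-anchored edge ngram generation. It only illustrates the idea and is not Elasticsearch's implementation; the function name `edge_ngrams` is made up for this example.

```python
def edge_ngrams(token, min_gram, max_gram):
    """Return the prefixes of token whose lengths run from min_gram to max_gram."""
    # Never ask for a prefix longer than the token itself.
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# With a minimum of 2, short prefixes such as "ha" are produced.
print(edge_ngrams("haystack", 2, 15))
# ['ha', 'hay', 'hays', 'hayst', 'haysta', 'haystac', 'haystack']

# With a minimum of 5, the shortest prefix is "hayst".
print(edge_ngrams("haystack", 5, 15))
# ['hayst', 'haysta', 'haystac', 'haystack']
```

In this sketch a token shorter than `min_gram` yields no ngrams at all, which is worth keeping in mind when raising the minimum to 4 or 5: short words may stop matching through this field.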