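Introduction
Whoosh: a search engine library. jieba: a Chinese word segmenter. django-haystack: a third-party Django app that puts a common search API on top of engines such as Whoosh.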
Preparation
pip3 install whoosh
pip3 install jieba
pip3 install django-haystack
Configuration
Add haystack to INSTALLED_APPS:
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    # other apps ...
    'search_liu',
    'haystack',
]
Then add the following settings:
project/settings.py
# os.path.join below needs `import os` at the top of settings.py
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'search_liu.whoosh_cn_backend.WhooshEngine',  # use the Whoosh search engine
        'PATH': os.path.join(BASE_DIR, 'whooshindex'),
    },
}
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 10  # ten results per page
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
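'ENGINE': 'search_liu.whoosh_cn_backend.WhooshEngine' does not exist yet; we create it below.
'PATH' is where the index files are stored. We set it to the whooshindex folder under the project root BASE_DIR (the folder is created automatically when the index is first built).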
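To check where the index will end up, the sketch below rebuilds the same dictionary outside Django. BASE_DIR here is a stand-in temporary directory, not the project's real value, so the printed path will differ from yours.

```python
import os
import tempfile

# Stand-in for the project's BASE_DIR (assumption: any writable directory).
BASE_DIR = tempfile.mkdtemp()

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'search_liu.whoosh_cn_backend.WhooshEngine',
        'PATH': os.path.join(BASE_DIR, 'whooshindex'),
    },
}

index_path = HAYSTACK_CONNECTIONS['default']['PATH']
print(index_path)                  # <BASE_DIR>/whooshindex
print(os.path.exists(index_path))  # False: nothing has built the index yet
```

The folder name in the setting is what ends up on disk, so it is worth keeping the setting and any deployment scripts in sync.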
Configuring the index
Create a search_indexes.py file in the app and add the following code:
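from haystack import indexes
from web.models import news  # model location as imported in search/views.py


class newsIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        return news

    def index_queryset(self, using=None):
        # Restrict which rows are indexed and therefore searchable
        return self.get_model().objects.filter(newsState=2)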
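Because I need to search several tables, I wrote three such classes in search_indexes.py under the search app, each named after its table plus Index. Indexes are then built for all three tables together.
Next, list the fields to index in templates/search/indexes/yourapp/<model_name>_text.txt. With several tables there are several txt files.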
{{ object.title }}
{{ object.mainBody }}
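As a rough picture of what this template produces, the sketch below imitates it in plain Python. haystack itself renders the file with Django's template engine; FakeNews and render_document are made-up names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class FakeNews:
    """Stand-in for a `news` row; only the two indexed fields are modelled."""
    title: str
    mainBody: str


def render_document(obj):
    # Mirrors the two-line template: one field per line.
    return f"{obj.title}\n{obj.mainBody}"


doc = render_document(FakeNews(title="Hello", mainBody="Body text"))
print(doc)
```

The resulting string is what ends up in the `text` field of the index, so any field left out of the template cannot be matched by a search.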
Switching the search engine to Chinese word segmentation
Create a ChineseAnalyser.py file under the search app with the following code:
import jieba
from whoosh.analysis import Tokenizer, Token


class ChineseTokenizer(Tokenizer):
    def __call__(self, value, positions=False, chars=False,
                 keeporiginal=False, removestops=True,
                 start_pos=0, start_char=0, mode='', **kwargs):
        # A single Token object is reused and yielded once per word
        t = Token(positions, chars, removestops=removestops, mode=mode,
                  **kwargs)
        seglist = jieba.cut_for_search(value)
        for w in seglist:
            t.original = t.text = w
            t.boost = 1.0
            if positions:
                # Offset of the first occurrence of w in value
                t.pos = start_pos + value.find(w)
            if chars:
                t.startchar = start_char + value.find(w)
                t.endchar = start_char + value.find(w) + len(w)
            yield t


def chinese_analyzer():
    return ChineseTokenizer()
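One detail of the tokenizer above: value.find(w) always returns the first occurrence of w, so a word that appears twice gets the same offset both times. The sketch below shows the effect on a hand-written segmentation (not real jieba output) next to one possible way of tracking offsets with a cursor. Both helper names are made up for illustration, and the cursor approach assumes the words arrive in order and do not overlap, which cut_for_search does not guarantee.

```python
def first_match_offsets(value, words):
    """Offsets as the tokenizer above computes them: always the first match."""
    return [(w, value.find(w)) for w in words]


def cursor_offsets(value, words):
    """Offsets for an in-order, non-overlapping segmentation."""
    offsets = []
    cursor = 0
    for w in words:
        start = value.find(w, cursor)
        if start == -1:
            # Not found after the cursor: fall back to the first match
            start = value.find(w)
        else:
            cursor = start + len(w)
        offsets.append((w, start))
    return offsets


value = "搜索引擎和搜索"
words = ["搜索", "引擎", "和", "搜索"]  # hand-written, not produced by jieba

print(first_match_offsets(value, words))
print(cursor_offsets(value, words))
```

With the first helper the final "搜索" is reported at offset 0; with the second it is reported at offset 5, where it actually sits.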
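In your Python installation, find whoosh_backend.py under Lib\site-packages\haystack\backends, copy it into the search app, and rename it whoosh_cn_backend.py. This must be the same package that the 'ENGINE' setting points to (written as search_liu in the settings above).
Add this import to it:
from search import ChineseAnalyser
Then find the statement that builds the TEXT field and change its analyzer argument as follows:
schema_fields[field_class.index_fieldname] = TEXT(stored=True, analyzer=ChineseAnalyser.chinese_analyzer(), field_boost=field_class.boost, sortable=True)
Finally, run python3 manage.py rebuild_index to build the index files.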
Creating the search form
<div class="input-group" style="width:370px">
  <div style="float:right">
    <form action="" id="search_form" method="get" onsubmit='return sub_search_form()'>
      <!-- Do not change name='q' -->
      <input type="text" class="form-control" style="width:229px;float:left;" name="q" placeholder=" Enter keywords">
      <span class="input-group-btn" >
        <button class="btn btn-info" id="search" style="width:60px;height:34px;background-color:purple;border-color:purple" type="submit"><i class="glyphicon glyphicon-search"></i></button>
      </span>
    </form>
  </div>
  <!-- Do not put the select tag inside the form -->
  <select id="option" class="form-control" style="height:32px;width:77px;">
    <option value="0">All</option>
    <option value="1">News</option>
    <option value="2">Announcements</option>
    <option value="3">Papers</option>
  </select>
  <!-- Do not put the select tag inside the form -->
</div>
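Backend handling
The form above sends its request to the backend through JavaScript. The select box only chooses the URL the form is submitted to: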
function sub_search_form(){
    // 0: all, 1: news, 2: announcements, 3: papers
    var obj = document.getElementById('option');
    var form = document.getElementById('search_form');
    var value = obj.value;
    //alert(value)
    switch (value){
        case '0': form.action = '/search/';
            break;
        case '1': form.action = '/search/news/';
            break;
        case '2': form.action = '/search/announcement/';
            break;
        case '3': form.action = '/search/thesis_information/';
            break;
        default: break;
    }
}
search/views.py is as follows:
from haystack.generic_views import SearchView
from haystack.query import SearchQuerySet
from web.models import news, announcement, thesis_information

# Maps the URL segment to the model to search
model_map = {'news': news, 'announcement': announcement, 'thesis_information': thesis_information}


class VisitorSearchView(SearchView):
    def get_queryset(self):
        queryset = super(VisitorSearchView, self).get_queryset()
        self.context_object_name = 'search_list'
        # Get the model name from the URL
        model_name = self.kwargs.get('model')
        # Search a single table
        if model_name:
            model = model_map[model_name]
            queryset = SearchQuerySet().models(model)
            if model_name == 'thesis_information':
                self.context_object_name = 'search_thesis_list'
        # Search all tables
        else:
            self.template_name = 'search/search_all.html'
        return queryset
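Note that model_map[model_name] in the view above raises KeyError for a URL segment that is not in the map. The sketch below isolates the same branching in a plain function so it can be tried without Django. resolve_search_target and the placeholder classes are hypothetical, not part of haystack or of the project.

```python
def resolve_search_target(model_name, model_map):
    """Return (model, context_object_name, template_name) for a URL segment.

    model is None for the "search everything" case. template_name is None
    when the view's default template should be kept. Unknown names raise
    LookupError so the caller can turn them into a 404.
    """
    if not model_name:
        return None, 'search_list', 'search/search_all.html'
    if model_name not in model_map:
        raise LookupError(f"unknown search target: {model_name!r}")
    if model_name == 'thesis_information':
        context_name = 'search_thesis_list'
    else:
        context_name = 'search_list'
    return model_map[model_name], context_name, None


# Placeholder classes standing in for the real Django models.
class News: pass
class Thesis: pass

demo_map = {'news': News, 'thesis_information': Thesis}

print(resolve_search_target('news', demo_map))
print(resolve_search_target(None, demo_map))
```

In a real view, the LookupError case would typically be translated into Django's Http404 rather than left to surface as a server error.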
search/urls.py
from django.urls import path
from search.views import VisitorSearchView

urlpatterns = [
    path('<str:model>/', VisitorSearchView.as_view()),
    path('', VisitorSearchView.as_view()),
]
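To see which keyword arguments reach the view for each form action, here is a rough regex imitation of the two patterns. It assumes these URLs are included under a /search/ prefix in the project's root URLconf (the form actions suggest this, but that file is not shown), so only the part after the prefix is matched. resolve is a made-up helper, not Django's resolver.

```python
import re

# Rough regex equivalents of the two patterns. Django's `str` converter
# matches any non-empty string that contains no '/'.
PATTERNS = [
    (re.compile(r'^(?P<model>[^/]+)/$'), 'per-model search'),
    (re.compile(r'^$'), 'search all'),
]


def resolve(sub_path):
    """Return the kwargs the view would receive, or None if nothing matches."""
    for regex, _label in PATTERNS:
        m = regex.match(sub_path)
        if m:
            return m.groupdict()
    return None


print(resolve('news/'))  # {'model': 'news'}
print(resolve(''))       # {}
print(resolve('a/b/'))   # None
```

The empty dict for the bare prefix is why self.kwargs.get('model') returns None in the view and the "search all" branch runs.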
References
https://www.cnblogs.com/fuhuixiang/p/4488029.html
https://www.zmrenwu.com/courses/django-blog-tutorial/materials/27/
https://www.cnblogs.com/ftl1012/p/10397553.html
https://github.com/stormsha/blog
https://stormsha.com/
SearchView: https://blog.csdn.net/BetrayArmy/article/details/83512700