Problem description
I have a question closely related to this question.
In my schema I have the following field:
<field name="text" type="textgen" indexed="true" stored="true" required="true"/>
This gives an exact match, i.e. stemming is disabled.
Is it possible, while the field is configured as textgen, to search for other variants of the word?
eat~0 returns similar-sounding words such as meat, beat, etc., but this is not what I want.
I'm starting to think that the only way to achieve this is to add another field with a type other than textgen, but if there is a simpler way I would be very interested to hear it.
Recommended answer
Using copyField statements is the normal approach in Solr. Since stemming is the answer to exactly what you're asking, this is what I recommend you use. You can set stored=false on the copy if you are worried about index size.
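As a rough sketch of the copyField approach, the schema fragment below keeps the exact-match field from the question and adds a stemmed copy next to it. The names text_stemmed and text_en_stemmed are made up for illustration, and the analyzer chain is only one plausible choice, not necessarily what your schema should use.

```xml
<!-- Existing exact-match field from the question -->
<field name="text" type="textgen" indexed="true" stored="true" required="true"/>

<!-- Hypothetical stemmed copy: indexed for searching only, not stored -->
<field name="text_stemmed" type="text_en_stemmed" indexed="true" stored="false"/>

<!-- Copy the raw input of "text" into the stemmed field at index time -->
<copyField source="text" dest="text_stemmed"/>

<!-- Hypothetical field type with stemming enabled (assumed analyzer chain) -->
<fieldType name="text_en_stemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

With something like this in place, a query can target text when an exact match is wanted and text_stemmed when other variants of the word should match too. Documents need to be reindexed after the schema change.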
You might also use lemmatisation in the sense of query expansion, which works in the opposite direction from stemming: instead of reducing words to a common stem, you add all of a word's inflected forms. This is typically performed on the search query, expanding e.g. eat to eat, eats, eating, etc.
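One way this kind of query-side expansion could be sketched in Solr is with a synonym filter applied only in the query analyzer. This is an assumption about how to implement it, not something the answer prescribes. The field type name text_exact_expanded and the file inflections.txt are hypothetical, and newer Solr versions provide solr.SynonymGraphFilterFactory in place of solr.SynonymFilterFactory.

```xml
<!-- Hypothetical field type: no stemming, inflected forms added at query time only -->
<fieldType name="text_exact_expanded" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- inflections.txt is an assumed file containing lines such as:
         eat, eats, eating -->
    <filter class="solr.SynonymFilterFactory" synonyms="inflections.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The index still holds only the exact tokens, so the list of inflected forms has to be maintained by hand or generated from a dictionary.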
The third alternative might be to use wildcard search, although I wouldn't encourage it, not least because it bypasses the schema-configured filters for the target field.