Difference between StandardTokenizerFactory and KeywordTokenizerFactory in Solr?

Question

I am new to Solr. I want to know when to use StandardTokenizerFactory and when to use KeywordTokenizerFactory.

I read the docs on the Apache Wiki, but I still don't understand it.

Can anybody explain the difference between StandardTokenizerFactory and KeywordTokenizerFactory?

Recommended answer

StandardTokenizerFactory:

It splits the text into tokens at word boundaries (following the Unicode text segmentation rules), discarding whitespace and most punctuation characters.

Use this for fields where you want to search on the field data.

For example,

http://example.com/I-am+example?Text=-Hello

would generate 7 tokens (shown here separated by commas):

http,example.com,I,am,example,Text,Hello
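
To see why the URL breaks into those 7 tokens, here is a rough Python sketch. It is only an approximation built on a regular expression, not Lucene's actual tokenizer, and the function name `approx_standard_tokenize` is made up for this illustration. It assumes ASCII input and keeps a dot or apostrophe only when it sits between two alphanumeric runs, which is enough to reproduce the example.

```python
import re

# Hypothetical helper: a regex approximation of word-boundary tokenization.
# It is NOT the real StandardTokenizer, which implements the full Unicode
# text segmentation rules; this only handles simple ASCII input.
_TOKEN = re.compile(r"[A-Za-z0-9]+(?:[.'][A-Za-z0-9]+)*")

def approx_standard_tokenize(text):
    """Return alphanumeric runs, keeping '.' or \"'\" between two runs."""
    return _TOKEN.findall(text)

tokens = approx_standard_tokenize("http://example.com/I-am+example?Text=-Hello")
print(",".join(tokens))  # http,example.com,I,am,example,Text,Hello
print(len(tokens))       # 7
```

Note that "example.com" survives as one token because the dot is surrounded by letters, while ":", "/", "-", "+", "?" and "=" all act as separators and are dropped.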

KeywordTokenizerFactory:

The Keyword Tokenizer does not split the input at all. No processing is performed on the string, and the whole string is treated as a single entity. In other words, it does not actually do any tokenization; it returns the original text as one term.

It is mainly used for fields that need sorting or faceting: filtering on a multi-word facet value has to match the exact value, and sorting does not work on tokenized fields.

For example,

http://example.com/I-am+example?Text=-Hello

would generate a single token:

http://example.com/I-am+example?Text=-Hello
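
The effect on faceting can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Solr code: `keyword_tokenize` and `word_tokenize` are hypothetical helpers, the second one is just a whitespace split standing in for any word-splitting tokenizer, and the city names are invented sample data.

```python
from collections import Counter

def keyword_tokenize(text):
    """Return the whole input as a single term (no splitting at all)."""
    return [text]

def word_tokenize(text):
    """Naive whitespace split, standing in for a word-splitting tokenizer."""
    return text.split()

url = "http://example.com/I-am+example?Text=-Hello"
print(keyword_tokenize(url))  # ['http://example.com/I-am+example?Text=-Hello']

# Invented sample values for a multi-word facet field.
cities = ["New York", "New Delhi", "New York"]

# Facet counts are counts of indexed terms, so the tokenizer decides
# what a facet value looks like.
exact_facets = Counter(t for c in cities for t in keyword_tokenize(c))
split_facets = Counter(t for c in cities for t in word_tokenize(c))

print(exact_facets["New York"])  # 2 (the full value is one term)
print(split_facets["New"])       # 3 ("New" is counted across both cities)
```

With the whole value kept as one term, the facet "New York" can be matched exactly; once the value is split, the facet list contains fragments such as "New" that no longer correspond to any single city.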
