German 'ue' -> 'u' conversion in Lucene

Question

I have two questions regarding handling German umlauts in Lucene:

  1. I'm trying to find a way to convert German umlauts written as 'ue', 'ae', etc. to the folded forms 'u', 'a', and so on. This is done by GermanAnalyzer (and the German2StemFilter it uses), but unfortunately it also does stemming, which is very undesirable in my case. Is there another filter that does only the 'ue' -> 'u' conversion?

  2. Is there any filter that converts 'ü' -> 'ue' (NOT to 'u', as ASCIIFoldingFilter does)? What I'm trying to achieve is that the word "über" is found in the index whenever the user searches for "über" or "ueber", but NOT for "uber".

Recommended answer

GermanNormalizationFilter applies the german2 algorithm's normalization, but without the stemming:

https://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html
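To make the folding concrete, here is a minimal plain-Java sketch of the rules listed in that Javadoc: 'ß' becomes 'ss'; 'ä', 'ö', 'ü' become 'a', 'o', 'u'; 'ae' and 'oe' become 'a' and 'o'; and 'ue' becomes 'u' when it does not follow a vowel or 'q'. It is an illustration only, not Lucene's implementation: the class name `GermanFoldingSketch` and the method `normalize` are hypothetical, input is assumed to be already lowercased, and edge cases may differ from the real filter. In Lucene itself the filter would instead be wrapped around a tokenizer's TokenStream inside a custom Analyzer.

```java
// Hypothetical illustration of the folding rules documented for
// GermanNormalizationFilter. This is NOT the Lucene source code.
final class GermanFoldingSketch {

    // True for the characters after which "ue" is left untouched.
    private static boolean isVowelOrQ(char c) {
        return "aeiouyq".indexOf(c) >= 0
                || c == '\u00e4'   // a-umlaut
                || c == '\u00f6'   // o-umlaut
                || c == '\u00fc';  // u-umlaut
    }

    // Folds one lowercased term according to the documented rules.
    static String normalize(String term) {
        StringBuilder out = new StringBuilder(term.length() + 1);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            char prev = i > 0 ? term.charAt(i - 1) : '\0';
            char prev2 = i > 1 ? term.charAt(i - 2) : '\0';
            switch (c) {
                case '\u00e4': out.append('a'); break;   // a-umlaut -> a
                case '\u00f6': out.append('o'); break;   // o-umlaut -> o
                case '\u00fc': out.append('u'); break;   // u-umlaut -> u
                case '\u00df': out.append("ss"); break;  // sharp s -> ss
                case 'e':
                    // Drop the 'e' of "ae"/"oe", and of "ue" unless the 'u'
                    // itself follows a vowel or 'q' (as in "quelle").
                    boolean afterAorO = prev == 'a' || prev == 'o';
                    boolean afterPlainU = prev == 'u' && !isVowelOrQ(prev2);
                    if (!(afterAorO || afterPlainU)) {
                        out.append('e');
                    }
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] samples = {"\u00fcber", "ueber", "uber", "quelle", "stra\u00dfe"};
        for (String s : samples) {
            System.out.println(s + " -> " + normalize(s));
        }
    }
}
```

Under these rules "über", "ueber" and "uber" all reduce to "uber", so this folding answers the first question but, on its own, does not keep "uber" apart from the other two as the second question asks.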
