Problem description
We are currently using Lucene 2.3.2 and want to migrate to 3.4.0. We have a custom Tokenizer generated with JavaCC that has been in use ever since we started using Lucene, and we want to keep its behavior unchanged. I would appreciate pointers to any resources on building a Tokenizer for the new TokenStream API from a grammar.
Update:
I found the grammar used to generate StandardTokenizer at http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex?view=log&pathrev=692211. I modified the grammar to suit our requirements and generated the Java code using JFlex (http://jflex.de/).
Recommended answer
I found the grammar used to generate StandardTokenizer at http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/analysis/standard/StandardTokenizerImpl.jflex?view=log&pathrev=692211. I modified the grammar to suit our requirements and generated the Java code using JFlex (http://jflex.de/).
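To illustrate how a generated scanner plugs into the new API: in Lucene 3.x a Tokenizer no longer returns Token objects from next(); it implements incrementToken() and publishes the term text and offsets through attributes such as CharTermAttribute and OffsetAttribute. The sketch below shows only that adapter pattern and deliberately has no Lucene dependency. The Scanner interface, SimpleScanner, and GrammarTokenizer are hypothetical stand-ins: SimpleScanner takes the place of the JFlex-generated class, and the plain fields on GrammarTokenizer take the place of the Lucene attributes. Treat it as a rough outline under those assumptions, not as the code the answer actually used.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class TokenizerSketch {

    /** Hypothetical stand-in for the methods a JFlex-generated scanner exposes. */
    interface Scanner {
        int YYEOF = -1;
        int getNextToken() throws IOException; // token type, or YYEOF at end of input
        int yychar();                          // start offset of the current match
        int yylength();                        // length of the current match
        String yytext();                       // text of the current match
    }

    /** Hand-written stand-in for generated code: emits runs of letters or digits. */
    static final class SimpleScanner implements Scanner {
        private final Reader in;
        private final StringBuilder text = new StringBuilder();
        private int pos = 0;   // number of characters consumed so far
        private int start = 0; // start offset of the current match

        SimpleScanner(Reader in) { this.in = in; }

        public int getNextToken() throws IOException {
            text.setLength(0);
            int c;
            while ((c = in.read()) != -1) {
                pos++;
                if (Character.isLetterOrDigit((char) c)) {
                    if (text.length() == 0) start = pos - 1;
                    text.append((char) c);
                } else if (text.length() > 0) {
                    return 0; // a delimiter ends the current token
                }
            }
            return text.length() > 0 ? 0 : YYEOF;
        }

        public int yychar() { return start; }
        public int yylength() { return text.length(); }
        public String yytext() { return text.toString(); }
    }

    /** Adapter: pulls one match per call, in the style of incrementToken(). */
    static final class GrammarTokenizer {
        private final Scanner scanner;
        String term;      // stand-in for CharTermAttribute
        int startOffset;  // stand-in for OffsetAttribute
        int endOffset;

        GrammarTokenizer(Reader input) { this.scanner = new SimpleScanner(input); }

        boolean incrementToken() throws IOException {
            // Reset per-token state first (the role clearAttributes() plays in Lucene).
            term = null;
            startOffset = 0;
            endOffset = 0;
            if (scanner.getNextToken() == Scanner.YYEOF) return false;
            term = scanner.yytext();
            startOffset = scanner.yychar();
            endOffset = startOffset + scanner.yylength();
            return true;
        }
    }

    /** Consumer loop: collects "term[start,end)" strings for each token. */
    public static List<String> tokenize(String s) {
        GrammarTokenizer t = new GrammarTokenizer(new StringReader(s));
        List<String> out = new ArrayList<String>();
        try {
            while (t.incrementToken()) {
                out.add(t.term + "[" + t.startOffset + "," + t.endOffset + ")");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo bar-baz"));
    }
}
```

In a real migration, GrammarTokenizer would extend org.apache.lucene.analysis.Tokenizer, obtain its attributes with addAttribute(...), and delegate to the class that JFlex generates from the modified grammar; the consumer loop keeps the same shape.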