Problem description
I would like to parse the hunspell-formatted aff and dic files that OpenOffice supports.
English aff and dic files: http://extensions.openoffice.org/en/project/english-dictionaries-apache-openoffice
I want to scan each line of the given .dic file and generate every possible word for that line using the provided .aff file.
How can I do that?
I have installed the NHunspell framework, but it does not have that feature: https://www.nuget.org/packages/NHunspell/
Taking English as an example, consider
make/UAGS
make can become make, made, makes, making, etc.
Now I need a parser that gives me all these combinations. How can I obtain them? Thank you very much.
So basically I want to scan each line of the dictionary and generate all possible words from the word on that line, and I don't know how to do that.
I could also write my own parser, but the rules seem quite complex to me, and I have not found detailed, easy-to-follow documentation on them.
Here is basically what I want. The image explains it very clearly.
Given analyze/ADSG and the en.dic and en.aff files, I want to obtain all of the following words:
analyze, analyzes, analyzing, analyzed, reanalyze, reanalyzes, reanalyzing, reanalyzed
Recommended answer
If you want the entire database, you can run unmunch:
unmunch dictionary.dic dictionary.aff
Note that the current implementation of unmunch in hunspell has fixed limits on the maximum number of words, the number of affixes, and the length of generated words. unmunch may therefore fail if the target language exceeds those limits.
If you just want the list of possible words that can be generated from a single entry, you can use wordforms:
wordforms dictionary.aff dictionary.dic word
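For readers who would rather see roughly what such an expansion involves for a single entry, here is a minimal Python sketch. It is not how unmunch or wordforms are implemented, and it covers only a small part of the affix format: plain PFX/SFX rules, single-character flags, and the cross-product Y/N switch. The rule set `TOY_AFF` is hand-written for illustration and is not copied from a real en.aff. The helper names (`parse_aff`, `apply_suffix`, `apply_prefix`, `expand`) are hypothetical. Conditions are assumed to be simple enough to pass straight to Python's `re` module.

```python
import re

# Hand-written toy rules (an assumption for illustration, not a real en.aff).
TOY_AFF = """\
PFX A Y 1
PFX A 0 re .
SFX D Y 2
SFX D 0 d e
SFX D 0 ed [^ey]
SFX S Y 2
SFX S 0 es [sxzh]
SFX S 0 s [^sxzhy]
SFX G Y 2
SFX G e ing e
SFX G 0 ing [^e]
"""

def parse_aff(text):
    """Collect PFX/SFX rules keyed by flag; the first line per flag is its header."""
    rules = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[0] not in ("PFX", "SFX"):
            continue  # every other .aff directive is ignored in this sketch
        kind, flag = parts[0], parts[1]
        if flag not in rules:
            # Header line: PFX/SFX flag cross_product count
            rules[flag] = {"kind": kind, "cross": parts[2] == "Y", "entries": []}
            continue
        strip = "" if parts[2] == "0" else parts[2]   # "0" means strip nothing
        add = "" if parts[3] == "0" else parts[3]     # "0" means add nothing
        cond = parts[4] if len(parts) > 4 else "."
        rules[flag]["entries"].append((strip, add, cond))
    return rules

def apply_suffix(word, strip, add, cond):
    """Return the suffixed form, or None if the rule does not apply."""
    if not word.endswith(strip) or not re.search(f"(?:{cond})$", word):
        return None
    return word[: len(word) - len(strip)] + add

def apply_prefix(word, strip, add, cond):
    """Return the prefixed form, or None if the rule does not apply."""
    if not word.startswith(strip) or not re.match(f"(?:{cond})", word):
        return None
    return add + word[len(strip):]

def expand(entry, rules):
    """Expand one .dic entry such as 'analyze/ADSG' into its word forms."""
    stem, _, flags = entry.partition("/")
    words = [stem]
    suffixed = []       # suffixed forms whose rule group allows cross product
    prefix_rules = []
    for flag in flags:  # assumes single-character flags
        group = rules.get(flag)
        if group is None:
            continue
        for strip, add, cond in group["entries"]:
            if group["kind"] == "SFX":
                form = apply_suffix(stem, strip, add, cond)
                if form is not None:
                    words.append(form)
                    if group["cross"]:
                        suffixed.append(form)
            else:
                prefix_rules.append((strip, add, cond, group["cross"]))
    for strip, add, cond, cross in prefix_rules:
        # A cross-product prefix also combines with cross-product suffixed forms.
        for base in [stem] + (suffixed if cross else []):
            form = apply_prefix(base, strip, add, cond)
            if form is not None:
                words.append(form)
    return list(dict.fromkeys(words))  # de-duplicate, keep first-seen order

if __name__ == "__main__":
    print(expand("analyze/ADSG", parse_aff(TOY_AFF)))
```

With these toy rules, `analyze/ADSG` expands to the same eight words listed in the question. Real .aff files add many more features (multi-character or numeric flags, continuation classes, compounding, encodings, and so on), so for actual dictionaries the hunspell tools above remain the safer route.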