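I'm trying to run full-text search operations such as to_tsvector, to_tsquery, etc., and need dictionaries for 80+ languages.
Postgres seems to ship with only 16 language configurations; the other two listed below are for Chinese (jiebacfg and testzhcfg, aka ZHParse), which I am testing. I'm looking for documentation or repositories covering languages beyond these.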
mydatabase=# \dF
               List of text search configurations
   Schema   |    Name    |              Description
------------+------------+---------------------------------------
 pg_catalog | danish     | configuration for danish language
 pg_catalog | dutch      | configuration for dutch language
 pg_catalog | english    | configuration for english language
 pg_catalog | finnish    | configuration for finnish language
 pg_catalog | french     | configuration for french language
 pg_catalog | german     | configuration for german language
 pg_catalog | hungarian  | configuration for hungarian language
 pg_catalog | italian    | configuration for italian language
 pg_catalog | norwegian  | configuration for norwegian language
 pg_catalog | portuguese | configuration for portuguese language
 pg_catalog | romanian   | configuration for romanian language
 pg_catalog | russian    | configuration for russian language
 pg_catalog | simple     | simple configuration
 pg_catalog | spanish    | configuration for spanish language
 pg_catalog | swedish    | configuration for swedish language
 pg_catalog | turkish    | configuration for turkish language
 public     | jiebacfg   | configuration for jieba
 public     | testzhcfg  |
(18 rows)
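Best answer
As pozs commented, you can get dictionary files from OpenOffice (or LibreOffice) extensions. From the documentation:
To create an Ispell dictionary, perform these steps:
Download the dictionary configuration files. OpenOffice extension files have the .oxt extension. You need to extract the .aff and .dic files and change their extensions to .affix and .dict. For some dictionary files you also need to convert the characters to UTF-8 encoding with commands such as these (for example, for a Norwegian language dictionary):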
iconv -f ISO_8859-1 -t UTF-8 -o nn_no.affix nn_NO.aff
iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
Copy the files to the $SHAREDIR/tsearch_data directory.
Load the files into PostgreSQL with the following command:
CREATE TEXT SEARCH DICTIONARY english_hunspell (
    TEMPLATE = ispell,
    DictFile = en_us,
    AffFile = en_us,
    StopWords = english);
In addition, there is a list of extensions that provide an easy way to install dictionaries. You can download them from GitHub.
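With 80+ languages, the extract-and-convert steps quoted above are probably worth scripting. The following is only a rough sketch, not something from the documentation: the function names convert_dicts and print_create_statements, the directory layout, and the "<name>_ispell" dictionary names are all made up for illustration, and it assumes every input file shares one source encoding. An .oxt file is a zip archive, so the .aff and .dic files can be extracted with a tool such as unzip first. Output names are lowercased to match the nn_no / en_us style used in the documentation example.

```shell
#!/bin/sh
# Sketch: convert every extracted .aff/.dic pair in a directory to UTF-8,
# writing lowercase <name>.affix / <name>.dict files into an output directory.
# Assumption: all inputs use the same source encoding (passed as $3).
convert_dicts() {
    src_dir=$1      # directory holding files extracted from the .oxt archives
    out_dir=$2      # staging directory for the converted files
    encoding=$3     # source encoding, e.g. ISO_8859-1

    mkdir -p "$out_dir" || return 1
    for aff in "$src_dir"/*.aff; do
        [ -e "$aff" ] || continue            # glob matched nothing
        base=$(basename "$aff" .aff)
        dic="$src_dir/$base.dic"
        [ -e "$dic" ] || continue            # skip if the .dic half is missing
        name=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        iconv -f "$encoding" -t UTF-8 "$aff" > "$out_dir/$name.affix" || return 1
        iconv -f "$encoding" -t UTF-8 "$dic" > "$out_dir/$name.dict" || return 1
    done
}

# Sketch: print one CREATE TEXT SEARCH DICTIONARY statement per converted pair.
# The "<name>_ispell" dictionary names are hypothetical. StopWords is left out
# because a matching stop-word file may not exist for every language.
print_create_statements() {
    out_dir=$1
    for dict in "$out_dir"/*.dict; do
        [ -e "$dict" ] || continue
        name=$(basename "$dict" .dict)
        [ -e "$out_dir/$name.affix" ] || continue
        printf 'CREATE TEXT SEARCH DICTIONARY %s_ispell (\n' "$name"
        printf '    TEMPLATE = ispell,\n'
        printf '    DictFile = %s,\n' "$name"
        printf '    AffFile = %s);\n' "$name"
    done
}
```

One possible way to use it: run convert_dicts ./extracted ./converted ISO_8859-1, copy the contents of ./converted into "$(pg_config --sharedir)/tsearch_data" (this usually needs elevated privileges), and then pipe the output of print_create_statements ./converted into psql. Many newer dictionaries are already UTF-8 (the SET line near the top of the .aff file says which encoding is used), so check before converting.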