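I'm trying to run full-text search operations such as to_tsvector, to_tsquery, etc., and I need dictionaries for 80+ languages.
Postgres seems to ship with only 16 language configurations; the other two in the list are the ones I'm testing for Chinese (jiebacfg and testzhcfg, aka zhparser). I'm looking for documentation or repositories covering any languages beyond these.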

mydatabase=# \dF

               List of text search configurations
   Schema   |    Name    |              Description
------------+------------+---------------------------------------
 pg_catalog | danish     | configuration for danish language
 pg_catalog | dutch      | configuration for dutch language
 pg_catalog | english    | configuration for english language
 pg_catalog | finnish    | configuration for finnish language
 pg_catalog | french     | configuration for french language
 pg_catalog | german     | configuration for german language
 pg_catalog | hungarian  | configuration for hungarian language
 pg_catalog | italian    | configuration for italian language
 pg_catalog | norwegian  | configuration for norwegian language
 pg_catalog | portuguese | configuration for portuguese language
 pg_catalog | romanian   | configuration for romanian language
 pg_catalog | russian    | configuration for russian language
 pg_catalog | simple     | simple configuration
 pg_catalog | spanish    | configuration for spanish language
 pg_catalog | swedish    | configuration for swedish language
 pg_catalog | turkish    | configuration for turkish language
 public     | jiebacfg   | configuration for jieba
 public     | testzhcfg  |
(18 rows)

Best answer

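As pozs commented, you can get dictionary files from OpenOffice (or LibreOffice) extensions. From the documentation:
To create an Ispell dictionary, perform these steps:
Download the dictionary configuration files. OpenOffice extension files have the .oxt extension. You need to extract the .aff and .dic files and change their extensions to .affix and .dict. For some dictionary files you also need to convert the characters to UTF-8 encoding with commands like these (for example, for a Norwegian dictionary):
iconv -f ISO_8859-1 -t UTF-8 -o nn_no.affix nn_NO.aff
iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
Copy the files to the $SHAREDIR/tsearch_data directory.
Load the files into PostgreSQL with the following command:
CREATE TEXT SEARCH DICTIONARY english_hunspell (
    Template = ispell,
    DictFile = en_us,
    AffFile = en_us,
    Stopwords = english);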
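With 80+ languages, repeating these steps by hand gets tedious, so the steps above can be sketched as a small shell script. This is only a sketch under some assumptions: `prepare_dict` and `emit_dict_sql` are hypothetical helper names (not part of PostgreSQL), the source encoding has to be looked up per dictionary, and a stopword file with the given name is assumed to already exist in tsearch_data.

```shell
#!/bin/sh
# Sketch only: prepare_dict and emit_dict_sql are hypothetical helpers,
# not PostgreSQL tools.

# prepare_dict AFF DIC NAME SRC_ENCODING OUTDIR
# Converts the .aff/.dic pair to UTF-8 and writes NAME.affix and NAME.dict
# into OUTDIR. NAME should be lowercase (letters, digits, underscores),
# since that is what PostgreSQL accepts for dictionary file base names
# as far as I know.
prepare_dict() {
    aff=$1 dic=$2 name=$3 enc=$4 outdir=$5
    mkdir -p "$outdir" || return 1
    iconv -f "$enc" -t UTF-8 "$aff" > "$outdir/$name.affix" || return 1
    iconv -f "$enc" -t UTF-8 "$dic" > "$outdir/$name.dict" || return 1
}

# emit_dict_sql NAME STOPWORDS
# Prints the CREATE TEXT SEARCH DICTIONARY statement for one language,
# following the same pattern as the english_hunspell example.
emit_dict_sql() {
    name=$1 stop=$2
    cat <<EOF
CREATE TEXT SEARCH DICTIONARY ${name}_hunspell (
    Template = ispell,
    DictFile = ${name},
    AffFile = ${name},
    StopWords = ${stop}
);
EOF
}

# Example usage (paths and archive name are assumptions; an .oxt file is a
# zip archive, but the layout inside it varies between extensions):
#   unzip -o dict-nn.oxt -d /tmp/nn
#   prepare_dict /tmp/nn/nn_NO.aff /tmp/nn/nn_NO.dic nn_no ISO-8859-1 \
#       "$(pg_config --sharedir)/tsearch_data"
#   emit_dict_sql nn_no norwegian | psql mydatabase
```

Generating the SQL as text rather than running it directly makes it easy to review the statements for all languages before loading them. Dictionaries that are already UTF-8 can pass UTF-8 as the source encoding, which makes the conversion a plain copy.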
In addition, there is a list of extensions that provide an easy way to install dictionaries. You can download them from GitHub.

07-24 19:13