Problem description
I use PostgreSQL 11.8, running in the Docker image postgres:11-alpine. I want to create a custom full text search dictionary for expressions made up of several words, so that, for example, hello world becomes hw.
First of all, I have a custom full text search configuration, my_swedish:
CREATE TEXT SEARCH CONFIGURATION my_swedish (
COPY = swedish
);
ALTER TEXT SEARCH CONFIGURATION my_swedish
DROP MAPPING FOR hword_asciipart;
ALTER TEXT SEARCH CONFIGURATION my_swedish
DROP MAPPING FOR hword_part;
For this configuration I want to create and use a dictionary. For that I follow the PostgreSQL manual:
CREATE TEXT SEARCH DICTIONARY thesaurus_my_swedish (
TEMPLATE = thesaurus,
DictFile = thesaurus_my_swedish,
Dictionary = pg_catalog.swedish_stem
);
and I get:
ERROR: could not open thesaurus file "/usr/local/share/postgresql/tsearch_data/thesaurus_my_swedish.ths": No such file or directory
I then created the file manually:
touch /usr/local/share/postgresql/tsearch_data/thesaurus_astro.ths
Then:
ALTER TEXT SEARCH CONFIGURATION my_swedish
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
WITH thesaurus_my_swedish;
ERROR: text search configuration "my_swedish" does not exist
When I change it to the default swedish configuration:
ALTER TEXT SEARCH CONFIGURATION swedish
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
WITH thesaurus_my_swedish;
I get the error:
ERROR: text search dictionary "thesaurus_my_swedish" does not exist
How do I correctly create a thesaurus dictionary for my custom text search configuration?
UPDATE
I added the following data to my file thesaurus_my_swedish.ths:
hello world : hw
And now
SELECT to_tsvector('my_swedish', 'hello world');
returns 'hw':1
But what about other words? to_tsvector('my_swedish', 'hello test') returns an empty result, although it should return the same as the default swedish configuration:
SELECT to_tsvector('swedish', 'hello test');
'hello':1 'test':2
What is wrong?
UPDATE
I understand now that pg_catalog.swedish_stem needs to be added to the mapping as well:
ALTER TEXT SEARCH CONFIGURATION my_swedish
ALTER MAPPING FOR asciihword, asciiword, hword, word
WITH thesaurus_my_swedish, pg_catalog.swedish_stem;
Recommended answer
You did everything right, with a few exceptions:
- thesaurus_my_swedish.ths should not be empty, but contain rules like this (taken from your example):
hello world : hw
- You should use the new dictionary for all token types that currently use swedish_stem, that is:
ALTER TEXT SEARCH CONFIGURATION my_swedish
ALTER MAPPING FOR asciihword, asciiword, hword, word
WITH thesaurus_my_swedish, swedish_stem;
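One way to check that the mapping behaves as intended is to ask ts_debug which dictionary handles each token. This is only a sketch; it assumes the my_swedish configuration and the mapping shown above are already in place.

```sql
-- For every token, show its type, the dictionaries that are consulted in order,
-- the dictionary that actually recognized it, and the resulting lexemes.
SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('my_swedish', 'hello test');
```

If the dictionaries column lists only thesaurus_my_swedish for a token type, words that match no thesaurus rule have nothing to fall back on, which would explain an empty to_tsvector result.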
This error is mysterious and should not have happened:
ERROR: text search configuration "my_swedish" does not exist
Perhaps you connected to the wrong database, or you dropped the configuration again, or it is not on the search_path and you have to qualify it with its schema. Use \dF *.* in psql to list all existing configurations.
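The same checks can also be run as plain SQL, for example from a client other than psql. The queries below are a sketch against the system catalogs; the column aliases are chosen here for readability only.

```sql
-- Which database and search_path is this session actually using?
SELECT current_database();
SHOW search_path;

-- All text search configurations with their schemas.
SELECT n.nspname AS schema_name, c.cfgname AS config_name
FROM pg_catalog.pg_ts_config AS c
JOIN pg_catalog.pg_namespace AS n ON n.oid = c.cfgnamespace
ORDER BY n.nspname, c.cfgname;

-- All text search dictionaries with their schemas.
SELECT n.nspname AS schema_name, d.dictname AS dict_name
FROM pg_catalog.pg_ts_dict AS d
JOIN pg_catalog.pg_namespace AS n ON n.oid = d.dictnamespace
ORDER BY n.nspname, d.dictname;
```

If my_swedish or thesaurus_my_swedish is missing from the output, it was either never created in this database or has been dropped since.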
Of course, you have to create the dictionary before you can use it in a text search configuration.
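Put together in that order, the whole setup might look like the following sketch. It uses the object names from the question and assumes that thesaurus_my_swedish.ths already exists in the server's tsearch_data directory and contains the rule hello world : hw.

```sql
-- 1. Custom configuration, copied from the built-in Swedish one.
CREATE TEXT SEARCH CONFIGURATION my_swedish (COPY = swedish);
ALTER TEXT SEARCH CONFIGURATION my_swedish DROP MAPPING FOR hword_asciipart;
ALTER TEXT SEARCH CONFIGURATION my_swedish DROP MAPPING FOR hword_part;

-- 2. Create the thesaurus dictionary BEFORE referring to it in a mapping.
CREATE TEXT SEARCH DICTIONARY thesaurus_my_swedish (
    TEMPLATE   = thesaurus,
    DictFile   = thesaurus_my_swedish,    -- file name without the .ths suffix
    Dictionary = pg_catalog.swedish_stem  -- subdictionary that normalizes the rules
);

-- 3. Consult the thesaurus first, then fall back to the stemmer.
ALTER TEXT SEARCH CONFIGURATION my_swedish
    ALTER MAPPING FOR asciihword, asciiword, hword, word
    WITH thesaurus_my_swedish, swedish_stem;

-- 4. Try it out (the question reports 'hw':1 for this query).
SELECT to_tsvector('my_swedish', 'hello world');
```

All changes are made to my_swedish, so the built-in swedish configuration stays untouched.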
Don't modify the configurations in pg_catalog; such modifications would be lost after an upgrade.