How to correctly create a thesaurus dictionary for my custom text search configuration

Question

I use PostgreSQL 11.8, running in the Docker image postgres:11-alpine. I want to create a custom full text search dictionary for expressions made up of several words, so that, for example, hello world becomes hw.

First of all, I have a custom full text search configuration, my_swedish:

CREATE TEXT SEARCH CONFIGURATION my_swedish (
   COPY = swedish
);

ALTER TEXT SEARCH CONFIGURATION my_swedish
   DROP MAPPING FOR hword_asciipart;
ALTER TEXT SEARCH CONFIGURATION my_swedish
   DROP MAPPING FOR hword_part;

For this configuration I want to create and use a dictionary. For that I followed the PostgreSQL manual:

CREATE TEXT SEARCH DICTIONARY thesaurus_my_swedish (
    TEMPLATE = thesaurus,
    DictFile = thesaurus_my_swedish,
    Dictionary = pg_catalog.swedish_stem
);

and got this error:

ERROR:  could not open thesaurus file "/usr/local/share/postgresql/tsearch_data/thesaurus_my_swedish.ths": No such file or directory

I then created the file manually:

touch /usr/local/share/postgresql/tsearch_data/thesaurus_astro.ths

Then:

ALTER TEXT SEARCH CONFIGURATION my_swedish
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
    WITH thesaurus_my_swedish;

 ERROR:  text search configuration "my_swedish" does not exist

When I changed it to the default swedish configuration:

ALTER TEXT SEARCH CONFIGURATION swedish
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
    WITH thesaurus_my_swedish;

I got this error:

ERROR:  text search dictionary "thesaurus_my_swedish" does not exist

How do I correctly create a thesaurus dictionary for my custom text search configuration?

UPDATE: I added the rule hello world : hw to my file thesaurus_my_swedish.ths, and now

SELECT to_tsvector('my_swedish', 'hello world');

returns 'hw':1

But what about other words? to_tsvector('my_swedish', 'hello test') returns an empty result, while it should return the same as the default swedish configuration:

SELECT to_tsvector('swedish', 'hello test');
'hello':1 'test':2

What is wrong?

UPDATE

I understand now: I also need to add pg_catalog.swedish_stem:

ALTER TEXT SEARCH CONFIGURATION my_swedish
   ALTER MAPPING FOR asciihword, asciiword, hword, word
   WITH thesaurus_my_swedish, pg_catalog.swedish_stem;

Recommended answer

You did everything right, with a few exceptions:

  • thesaurus_my_swedish.ths should not be empty, but should contain rules like this one (taken from your example):

hello world : hw

  • You should use the new dictionary for all token types that currently use swedish_stem, that is:

    ALTER TEXT SEARCH CONFIGURATION my_swedish
       ALTER MAPPING FOR asciihword, asciiword, hword, word
       WITH thesaurus_my_swedish, swedish_stem;
    

  • This error is mysterious and should not have happened:

    ERROR:  text search configuration "my_swedish" does not exist
    

    Perhaps you connected to the wrong database, or you dropped the configuration again, or it is not on the search_path and you have to qualify it with its schema. Use \dF *.* in psql to list all existing configurations.

    Of course, you have to create the dictionary before you can use it in a text search configuration.

    Don't modify the configurations in pg_catalog; such modifications would be lost after an upgrade.
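
The checks suggested above can also be run as plain SQL, which may help when working from a client other than psql. This is only a rough sketch, not a definitive procedure: the catalog queries are an assumed alternative to \dF *.*, and the sample text 'hello world test' is just an illustration.

```sql
-- Confirm which database and search_path the session is actually using.
SELECT current_database();
SHOW search_path;

-- List every text search configuration together with its schema
-- (roughly what \dF *.* shows in psql).
SELECT n.nspname AS schema_name, c.cfgname AS config_name
FROM pg_catalog.pg_ts_config AS c
JOIN pg_catalog.pg_namespace AS n ON n.oid = c.cfgnamespace
ORDER BY 1, 2;

-- List every text search dictionary, to check that
-- thesaurus_my_swedish was really created in this database.
SELECT n.nspname AS schema_name, d.dictname AS dictionary_name
FROM pg_catalog.pg_ts_dict AS d
JOIN pg_catalog.pg_namespace AS n ON n.oid = d.dictnamespace
ORDER BY 1, 2;

-- Show which dictionaries are mapped to each token type of the sample text.
-- ts_debug examines one token at a time, so multi-word thesaurus rules
-- are better verified with to_tsvector below.
SELECT alias, token, dictionaries
FROM ts_debug('my_swedish', 'hello world test');

-- Exercise the full pipeline, including the "hello world : hw" rule.
SELECT to_tsvector('my_swedish', 'hello world test');
```

If the configuration turns out to live in a schema that is not on the search_path, qualifying the name with that schema in the calls above should work in the same way.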
