Indexing jsonb data for pattern matching searches

Question

This is a follow-up to:
Pattern matching on jsonb key/value

I have a table as follows:

CREATE TABLE "PreStage".transaction (
  transaction_id serial NOT NULL,
  transaction jsonb,
  CONSTRAINT pk_transaction PRIMARY KEY (transaction_id)
);

The content of my transaction jsonb column looks like this:

{"ADDR": "abcd", "CITY": "abcd", "PROV": "",
 "ADDR2": "", "ADDR3": "", "CNSNT": "Research-NA",
 "CNTRY": "NL", "EMAIL": "@.com", "PHONE": "12345",
 "HCO_NM": "HELLO", "UNQ_ID": "", "PSTL_CD": "1234",
 "HCP_SR_NM": "", "HCP_FST_NM": "", "HCP_MID_NM": ""}

I need to run search queries like:

SELECT transaction AS data FROM   "PreStage".transaction
WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%';

But I need to give my user flexibility to search any key/value on the fly.

An answer to the previous question suggested creating an index like this:

CREATE INDEX idxgin ON "PreStage".transaction
USING gin ((transaction->>'HCP_FST_NM') gin_trgm_ops);

That works, but I want to index other keys, too. So I tried something like:

CREATE INDEX idxgin ON "PreStage".transaction USING gin
((transaction->>'HCP_FST_NM'),(transaction->>'HCP_LST_NM') gin_trgm_ops)

That doesn't work. What would be the best indexing approach here? Or do I have to create a separate index for each key? In that case the approach would not be generic when a new key/value pair is added to the data.

Recommended answer

The syntax error that @jjanes pointed out aside, for a mix of some popular keys (contained in many rows and/or searched often) plus many more rare keys (contained in few rows and/or rarely searched; new keys might pop up dynamically), I suggest this combination:

It does not seem like you are going to combine multiple keys in one search often, and a single index with many keys would grow very big and slow. So I would create a separate index for each popular key. Make it a partial index for keys that are not contained in most rows:

CREATE INDEX trans_idxgin_HCP_FST_NM ON "PreStage".transaction  -- contained in most rows
USING gin ((transaction->>'HCP_FST_NM') gin_trgm_ops);

CREATE INDEX trans_idxgin_ADDR ON "PreStage".transaction  -- not in most rows
USING gin ((transaction->>'ADDR') gin_trgm_ops)
WHERE transaction ? 'ADDR';

And so on, as detailed in my previous answer:



  • Pattern matching on jsonb key/value
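
Two practical notes on the index definitions above, as a minimal sketch. The operator class gin_trgm_ops is provided by the additional module pg_trgm, which has to be installed in the database first. Separately, the multicolumn attempt from the question fails because text has no default GIN operator class, so every index expression needs its own. The index name trans_idxgin_multi is a hypothetical name chosen for illustration, and the multicolumn form is shown only to demonstrate the syntax; the recommendation here remains one index per popular key.

```sql
-- gin_trgm_ops ships with the additional module pg_trgm;
-- install it once per database (requires sufficient privileges).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Syntactically valid form of the multicolumn index from the question:
-- each index expression carries its own operator class.
-- trans_idxgin_multi is a hypothetical name, for illustration only.
CREATE INDEX trans_idxgin_multi ON "PreStage".transaction USING gin (
   (transaction->>'HCP_FST_NM') gin_trgm_ops
 , (transaction->>'HCP_LST_NM') gin_trgm_ops
);
```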

If you have many different keys and / or new keys are added dynamically, you can cover the rest with a basic (default) jsonb_ops GIN index:

CREATE INDEX trans_idxgin ON "PreStage".transaction USING gin (transaction);

Among other things, this supports the search for keys. But you cannot use it for pattern matching on values.



  • What's the proper index for querying structures in arrays in Postgres jsonb?
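
To make the distinction concrete, here is a short sketch of queries against the table from the question. The first two use operators that the default jsonb_ops operator class supports (key existence and containment); the third is a pattern match on an extracted value, which this index cannot serve. The literal values are made up for illustration.

```sql
-- Key existence: can use the jsonb_ops GIN index.
SELECT transaction_id
FROM   "PreStage".transaction
WHERE  transaction ? 'HCP_FST_NM';

-- Containment (exact key/value match): can use the jsonb_ops GIN index.
SELECT transaction_id
FROM   "PreStage".transaction
WHERE  transaction @> '{"CNTRY": "NL"}';

-- Pattern match on a value: not supported by the jsonb_ops index.
-- Needs a trigram expression index on this key, or a filter step.
SELECT transaction_id
FROM   "PreStage".transaction
WHERE  transaction->>'CITY' ILIKE '%abc%';
```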

Combine predicates addressing both indexes:

SELECT transaction AS data
FROM   "PreStage".transaction
WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%'
AND    transaction ? 'HCP_FST_NM';  -- even if that seems redundant.

The second condition happens to match our partial indexes as well.

So either there is a specific trigram index for the given (popular / common) key, or there is at least an index to find (the few) rows containing the rare key - and then filter for matching values. The same query should give you the best of both worlds.
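
As an illustration of the second case, the same query shape can be used for a key that has no dedicated trigram index. NEW_KEY is a hypothetical key name, assumed to occur in only a few rows.

```sql
SELECT transaction AS data
FROM   "PreStage".transaction
WHERE  transaction->>'NEW_KEY' ILIKE '%neer%'  -- filters the few rows found
AND    transaction ? 'NEW_KEY';                -- can use the jsonb_ops GIN index
```

Since the application only has to substitute the key name and the search pattern, the query text stays generic across popular and rare keys.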

Be sure to run the latest version of Postgres; there have been various updates to cost estimation recently. It is crucial that Postgres works with good estimates and current table statistics to choose the best query plan.
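
One way to check both points, sketched under the assumption that the indexes above exist: refresh the table statistics manually, then inspect the plan Postgres actually chooses for the combined query. Which index shows up in the plan depends on your data and version, so no particular output is claimed here.

```sql
-- Show the server version in use.
SHOW server_version;

-- Refresh planner statistics for the table.
ANALYZE "PreStage".transaction;

-- Run the query and display the chosen plan with timing and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT transaction AS data
FROM   "PreStage".transaction
WHERE  transaction->>'HCP_FST_NM' ILIKE '%neer%'
AND    transaction ? 'HCP_FST_NM';
```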
