问题描述
有两个表:
授权联系人(auth_contacts
):
(
userid varchar
contacts jsonb
)
contacts
包含属性为{contact_id, type}
discussion
:
(
contact_id varchar
discussion_id varchar
discussion_details jsonb
)
表auth_contacts
至少具有10万条记录,使其不适合使用JSONB类型,因为它会使记录数量增加一倍或三倍.
The table auth_contacts
has at least 100k records making it non JSONB type is not appropriate according as it would double or triple the amount of records.
auth_contacts
的样本数据:
userid | contacts
'11111' | '{"contact": [{"type": "type_a", "contact_id": "1-A-12"}
, {"type": "type_b", "contact_id": "1-A-13"}]}'
discussion
表具有500万个奇数记录.
discussion
table has 5 million odd records.
我想在discussion.contact_id
(关系列)上加入联系人ID,该联系人ID是auth_contacts.contacts
中json对象数组内的json对象.
I want to join on discussion.contact_id
(relational column) with contact id which a json object inside array of json objects in auth_contacts.contacts
.
一种非常粗略的方法是:
One very crude way is:
SELECT *
FROM discussion d
JOIN (SELECT userid, JSONB_OBJECT_KEYS(a.contacts) AS auth_contact
FROM auth_contacts a) AS contacts
ON (d.contact_id = contacts.auth_contact::text)
这实际上是在运行时创建(内部sql)userid vs contact id表(这是我要避免的,因此使用JSONB数据类型对于具有大量记录的用户的此查询需要花费26 +秒的时间,这并不是一件好事.尝试了其他几种方法: PostgreSQL 9.4 :在数组内的JSON字段ID上聚集/联接表
What this does is actually at runtime create (inner sql) userid vs contact id table (Which is what I was avoiding and hence went for JSONB data typeThis query for a user with large records takes 26 + seconds which is not all good.Tried a few other ways: PostgreSQL 9.4: Aggregate / Join table on JSON field id inside array
但是应该有一种更清洁,更好的方法,就像联接d.contact_id = contacts -> contact -> contact_id?
当我尝试此操作时,不会产生任何结果.
But there should be a cleaner and better way which would be as simple asJOIN d.contact_id = contacts -> contact -> contact_id?
When I try this, it doesn't yield any results.
在网上搜索时,这似乎是一项繁琐的工作?
When searching the net this seems to be a pretty cumbersome task?
推荐答案
概念证明
您的粗略方法"实际上是行不通的.这是另一种粗略的方法:
Proof of concept
Your "crude way" doesn't actually work. Here is another crude way that does:
SELECT *
FROM auth_contacts a
, jsonb_to_recordset(a.contacts->'contact') AS c(contact_id text)
JOIN discussion d USING (contact_id);
如前所述,您还可以使用包含运算符@>
:
As has been commented, you can also formulate a join condition with the contains operator @>
:
SELECT *
FROM auth_contacts a
JOIN discussion d ON a.contacts->'contact'
@> json_build_array(json_build_object('contact_id', d.contact_id))::jsonb
但是要使用JSON创建功能,而不要使用字符串连接.看起来很麻烦,但如果受功能性jsonb_path_ops GIN 索引支持,实际上会非常快:
But rather use JSON creation functions than string concatenation. Looks cumbersome but will actually be very fast if supported with a functional jsonb_path_ops GIN index:
CREATE INDEX auth_contacts_contacts_gin_idx ON auth_contacts
USING gin ((contacts->'contact') jsonb_path_ops);
详细信息:
- Index for finding an element in a JSON array
- Postgres 9.4 jsonb array as table
这一切都很有趣,但是这里的问题是关系模型.您的索赔:
This is all fascinating to play with, but the problem here is the relational model. Your claim:
与正确的相反.包装将表联接到JSON文档类型所需的ID是废话.通过多对多关系规范化表,并将在数据库内部使用的所有ID分别实现为具有适当数据类型的单独列.基础:
is the opposite of what's right. It's nonsense to wrap IDs you need for joining tables into a JSON document type. Normalize your table with a many-to-many relationship and implement all IDs you are working with inside the DB as separate columns with appropriate data type. Basics:
- How to perform update operations on columns of type JSONB in Postgres 9.4
- How to implement a many-to-many relationship in PostgreSQL?
这篇关于使用JSONB列内的值联接表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!