In PostgreSQL, I have two tables with a many-to-many association. The first table contains activities, each of which may count zero or more reasons:
CREATE TABLE activity (
    id integer NOT NULL,
    -- other fields removed for readability
);
CREATE TABLE reason (
    id varchar(1) NOT NULL,
    -- other fields here
);
To implement the association, a join table sits between these two tables:
CREATE TABLE activity_reason (
    activity_id integer NOT NULL, -- refers to activity.id
    reason_id varchar(1) NOT NULL, -- refers to reason.id
    CONSTRAINT activity_reason_activity FOREIGN KEY (activity_id) REFERENCES activity (id),
    CONSTRAINT activity_reason_reason FOREIGN KEY (reason_id) REFERENCES reason (id)
);
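I would like to count how often each combination of reasons occurs across activities. Suppose the activity_reason table holds these records: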
+--------------+------------+
| activity_id | reason_id |
+--------------+------------+
| 1 | A |
| 1 | B |
| 2 | A |
| 2 | B |
| 3 | A |
| 4 | C |
| 4 | D |
| 4 | E |
+--------------+------------+
I should get something like this:
+-------+---+------+-------+
| count | | | |
+-------+---+------+-------+
| 2 | A | B | NULL |
| 1 | A | NULL | NULL |
| 1 | C | D | E |
+-------+---+------+-------+
Or, alternatively, something like:
+-------+-------+
| count | |
+-------+-------+
| 2 | A,B |
| 1 | A |
| 1 | C,D,E |
+-------+-------+
I can't find the SQL query to do this.
Best answer
We need to compare sorted lists of reasons to identify equal sets.
SELECT count(*) AS ct, reason_list
FROM  (
   SELECT array_agg(reason_id) AS reason_list
   FROM  (SELECT * FROM activity_reason ORDER BY activity_id, reason_id) ar1
   GROUP  BY activity_id
   ) ar2
GROUP  BY reason_list
ORDER  BY ct DESC, reason_list;
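ORDER BY reason_id in the innermost subquery would work too, but adding activity_id is typically faster. And we don't strictly need the innermost subquery at all. This works as well: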
SELECT count(*) AS ct, reason_list
FROM  (
   SELECT array_agg(reason_id ORDER BY reason_id) AS reason_list
   FROM   activity_reason
   GROUP  BY activity_id
   ) ar2
GROUP  BY reason_list
ORDER  BY ct DESC, reason_list;
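But this is typically slower when processing all or most of the table. Quoting the manual:
Alternatively, supplying the input values from a sorted subquery will usually work.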
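To try the query without creating any tables, here is a minimal sketch that feeds the sample rows from the question through a VALUES list. The CTE is named activity_reason only so the query body stays unchanged; that naming is an assumption made for illustration, not part of the original schema.

```sql
-- Sample rows from the question, supplied inline instead of from a real table.
WITH activity_reason(activity_id, reason_id) AS (
   VALUES (1, 'A'), (1, 'B')
        , (2, 'A'), (2, 'B')
        , (3, 'A')
        , (4, 'C'), (4, 'D'), (4, 'E')
)
SELECT count(*) AS ct, reason_list
FROM  (
   -- One sorted array of reasons per activity
   SELECT array_agg(reason_id ORDER BY reason_id) AS reason_list
   FROM   activity_reason
   GROUP  BY activity_id
   ) ar2
GROUP  BY reason_list            -- equal sets collapse into one group
ORDER  BY ct DESC, reason_list;
```

With this data, the reason list {A,B} should come back with a count of 2, and {A} and {C,D,E} with a count of 1 each, matching the result requested in the question.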
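We could use string_agg() instead of array_agg(), and that would work for your example with varchar(1) (which might be more efficient with data type "char", btw). It can fail for longer strings, though: the aggregated value can be ambiguous.
If reason_id is an integer (as it typically is), there is another, faster solution with sort() from the additional module intarray:
SELECT count(*) AS ct, reason_list
FROM  (
   SELECT sort(array_agg(reason_id)) AS reason_list
   FROM   activity_reason2  -- presumably a variant of the table with integer reason_id
   GROUP  BY activity_id
   ) ar2
GROUP  BY reason_list
ORDER  BY ct DESC, reason_list;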
Related, with more explanation:
Compare arrays for equality, ignoring order of elements
Storing and comparing unique combinations