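I have a table with the following structure: name TEXT, favorite_colors TEXT[], group_name INTEGER, where each row holds a list of a person's favorite colors and the group that person belongs to. How can I GROUP BY group_name and return a list of the most common colors in each group?
Could you combine int[] && int[] to test for overlap and int[] & int[] to get the intersection, and then do something else to count and rank?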

Best answer

Quick and dirty:

SELECT group_name, color, count(*) AS ct
FROM (
   SELECT group_name, unnest(favorite_colors) AS color
   FROM   tbl
   ) sub
GROUP  BY 1,2
ORDER  BY 1,3 DESC;

With a LATERAL JOIN
In Postgres 9.3 or later, this is the cleaner form:
SELECT group_name, color, count(*) AS ct
FROM   tbl t, unnest(t.favorite_colors) AS color
GROUP  BY 1,2
ORDER  BY 1,3 DESC;

The above is shorthand for:
...
FROM tbl t
JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
...

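Like any other INNER JOIN, it excludes rows without colors (favorite_colors IS NULL, or an empty array), just as the first query does.
To include such rows in the result, use this instead: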
SELECT group_name, color, count(*) AS ct
FROM   tbl t
LEFT   JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
GROUP  BY 1,2
ORDER  BY 1,3 DESC;
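
One detail worth checking with the LEFT JOIN variant: count(*) counts the NULL-extended row too, so a group without any colors is reported with a count of 1. A minimal sketch, assuming made-up sample data supplied through a CTE in place of the real table (the names and colors are hypothetical), that contrasts count(*) with count(color):

```sql
-- Hypothetical sample data, for illustration only.
WITH tbl (name, favorite_colors, group_name) AS (
   VALUES
      ('alice', '{red,blue}'::text[], 1)
    , ('bob'  , '{red}'::text[]     , 1)
    , ('carol', NULL::text[]        , 2)  -- a row without colors
   )
SELECT t.group_name, color
     , count(*)     AS ct_rows    -- also counts the NULL-extended row
     , count(color) AS ct_colors  -- counts only actual colors
FROM   tbl t
LEFT   JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
GROUP  BY 1, 2
ORDER  BY 1, 4 DESC;

-- group_name | color | ct_rows | ct_colors
--          1 | red   |       2 |         2
--          1 | blue  |       1 |         1
--          2 |       |       1 |         0
```

Whether 1 or 0 is the right number for such a group depends on what the count is supposed to mean.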

In the next step you can easily aggregate the "most common" colors per group, but you need to define "most common colors" first ...
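
One possible definition, given here only as an illustration and not taken from the question: the color (or colors, in case of a tie) with the highest count in each group. A minimal sketch using the window function rank() over the aggregated counts, against the same tbl as above (the column alias rnk is an arbitrary choice):

```sql
SELECT group_name, color, ct
FROM  (
   SELECT t.group_name, color, count(*) AS ct
          -- window functions run after aggregation, so count(*) is allowed here
        , rank() OVER (PARTITION BY t.group_name
                       ORDER BY count(*) DESC) AS rnk
   FROM   tbl t, unnest(t.favorite_colors) AS color
   GROUP  BY 1, 2
   ) sub
WHERE  rnk = 1      -- keeps all colors tied for first place
ORDER  BY 1, 2;
```

Replacing rank() with row_number() would return exactly one color per group, picking arbitrarily among ties.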
Most common colors
As per the comment, pick colors with more than 3 occurrences.
SELECT t.group_name, color, count(*) AS ct
FROM   tbl t, unnest(t.favorite_colors) AS color
GROUP  BY 1,2
HAVING count(*) > 3
ORDER  BY 1,3 DESC;

To aggregate the top colors in an array (in descending order):
SELECT group_name, array_agg(color) AS top_colors
FROM  (
   SELECT group_name, color
   FROM   tbl t, unnest(t.favorite_colors) AS color
   GROUP  BY 1,2
   HAVING count(*) > 3
   ORDER  BY 1, count(*) DESC
   ) sub
GROUP BY 1;
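
Feeding array_agg() from a sorted subquery usually works, but the order is only strictly guaranteed when the ORDER BY is written inside the aggregate call itself. A sketch of that variant, assuming the same tbl; the tiebreaker on color is an added assumption to make the result deterministic:

```sql
SELECT group_name
     , array_agg(color ORDER BY ct DESC, color) AS top_colors
FROM  (
   SELECT t.group_name, color, count(*) AS ct
   FROM   tbl t, unnest(t.favorite_colors) AS color
   GROUP  BY 1, 2
   HAVING count(*) > 3
   ) sub
GROUP  BY 1;
```

This form stays correct even if the outer query later gains joins or other processing that could disturb the subquery's row order.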

-> SQLfiddle demonstrating all of the above.
