Find the most common elements in an array with GROUP BY

Question

I have a table with the following structure: `name TEXT, favorite_colors TEXT[], group_name INTEGER`. Each row holds one person's list of favorite colors and the group that person belongs to. How can I GROUP BY group_name and return a list of the most common colors in each group?

Could I combine `int[] && int[]` to test for overlap and `int[] & int[]` to get the intersection, and then use something else to count and rank?

Answer

Quick and dirty:

SELECT group_name, color, count(*) AS ct
FROM (
   SELECT group_name, unnest(favorite_colors) AS color
   FROM   tbl
   ) sub
GROUP  BY 1,2
ORDER  BY 1,3 DESC;
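
To try the query, a small test table helps. The following is only a sketch: the table name `tbl` and the column definitions follow the question, while the rows are made-up sample data.

```sql
-- Hypothetical sample data; only the table and column names come from the question.
CREATE TABLE tbl (
   name            text
 , favorite_colors text[]
 , group_name      integer
);

INSERT INTO tbl (name, favorite_colors, group_name) VALUES
   ('alice', '{red,blue}',    1)
 , ('bob',   '{red,green}',   1)
 , ('carol', '{red}',         1)
 , ('dave',  '{blue,yellow}', 2)
 , ('erin',  NULL,            2);  -- a person without any colors
```

With this data, the query above lists red with a count of 3 first for group 1, and the row for erin contributes nothing because unnest() of a NULL array produces no rows.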

Better with a `LATERAL JOIN`

In Postgres 9.3 or later this is the cleaner form:

SELECT group_name, color, count(*) AS ct
FROM   tbl t, unnest(t.favorite_colors) AS color
GROUP  BY 1,2
ORDER  BY 1,3 DESC;

The above is shorthand for

...
FROM tbl t
JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
...

Like any other INNER JOIN, it excludes rows without colors (favorite_colors IS NULL or an empty array), just as the first query does.

To include such rows in the result, use instead:

SELECT group_name, color, count(*) AS ct
FROM   tbl t
LEFT   JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
GROUP  BY 1,2
ORDER  BY 1,3 DESC;
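
One detail worth checking with the LEFT JOIN: count(*) counts the preserved row itself, so a group whose members have no colors still shows a count. The sketch below uses an inline VALUES list with made-up rows (not the real table) and the explicit alias `c(color)` to compare count(*) with count(c.color), which ignores NULL.

```sql
-- Made-up rows, supplied inline so the example runs without any table.
SELECT t.group_name, c.color
     , count(*)       AS ct_rows    -- counts rows, including the NULL-extended ones
     , count(c.color) AS ct_colors  -- counts only actual colors
FROM  (
   VALUES
      ('alice', '{red,blue}'::text[], 1)
    , ('erin',  NULL,                 2)   -- NULL array
    , ('frank', '{}',                 2)   -- empty array
   ) t(name, favorite_colors, group_name)
LEFT   JOIN LATERAL unnest(t.favorite_colors) AS c(color) ON TRUE
GROUP  BY 1, 2
ORDER  BY 1, 3 DESC;
```

For group 2 this returns a single row with a NULL color, ct_rows = 2 and ct_colors = 0. If a count of 0 is what you want for people without colors, count(c.color) is the safer choice.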

You can easily aggregate the "most common" colors per group in the next step, but you'd need to define "most common colors" first ...

Most common colors

As per the comment, pick colors with more than 3 occurrences.

SELECT t.group_name, color, count(*) AS ct
FROM   tbl t, unnest(t.favorite_colors) AS color
GROUP  BY 1,2
HAVING count(*) > 3
ORDER  BY 1,3 DESC;
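
If there is no fixed threshold, another possible definition is "the color or colors with the highest count in each group". This is an alternative to the threshold above, not what the comment asked for. A minimal sketch with the window function rank(), assuming the same table `tbl`:

```sql
-- Alternative definition (an assumption, not from the comment):
-- keep only the top-ranked color(s) per group; ties are all kept.
SELECT group_name, color, ct
FROM  (
   SELECT t.group_name, c.color, count(*) AS ct
        , rank() OVER (PARTITION BY t.group_name
                       ORDER BY count(*) DESC) AS rnk
   FROM   tbl t, unnest(t.favorite_colors) AS c(color)
   GROUP  BY 1, 2
   ) sub
WHERE  rnk = 1
ORDER  BY 1, 2;
```

Window functions are evaluated after GROUP BY, so rank() can order by the aggregate count(*) directly. Use row_number() instead of rank() if exactly one color per group is wanted, at the price of an arbitrary pick among ties.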

To aggregate the top colors in an array (in descending order):

SELECT group_name, array_agg(color) AS top_colors
FROM  (
   SELECT group_name, color
   FROM   tbl t, unnest(t.favorite_colors) AS color
   GROUP  BY 1,2
   HAVING count(*) > 3
   ORDER  BY 1, count(*) DESC
   ) sub
GROUP BY 1;
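
The query above relies on the sort order of the subquery being carried over into array_agg(). That usually works, but the order can also be stated inside the aggregate call, which does not depend on how the outer query is executed. A sketch of that variant, with `color` added as a tie-breaker (my assumption, to make the result deterministic):

```sql
-- Same result set, but the element order is requested in the aggregate itself.
SELECT group_name
     , array_agg(color ORDER BY ct DESC, color) AS top_colors
FROM  (
   SELECT t.group_name, c.color, count(*) AS ct
   FROM   tbl t, unnest(t.favorite_colors) AS c(color)
   GROUP  BY 1, 2
   HAVING count(*) > 3
   ) sub
GROUP  BY 1;
```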

SQL Fiddle demonstrating all of the above.
