PostgreSQL unnest() with element number

Question

When I have a column with separated values, I can use the unnest() function:

myTable
id | elements
---+------------
1  |ab,cd,efg,hi
2  |jk,lm,no,pq
3  |rstuv,wxyz

select id, unnest(string_to_array(elements, ',')) AS elem
from myTable

id | elem
---+-----
1  | ab
1  | cd
1  | efg
1  | hi
2  | jk
...

How can I include element numbers? I.e.:

id | elem | nr
---+------+---
1  | ab   | 1
1  | cd   | 2
1  | efg  | 3
1  | hi   | 4
2  | jk   | 1
...

I want the original position of each element in the source string. I've tried window functions (row_number(), rank() etc.), but I always get 1. Maybe because the elements are in the same row of the source table?

I know it's a bad table design. It's not mine, I'm just trying to fix it.

Recommended answer

Postgres 9.4 or later

Use WITH ORDINALITY for set-returning functions:

When a function in the FROM clause is suffixed by WITH ORDINALITY, a bigint column is appended to the output which starts from 1 and increments by 1 for each row of the function's output. This is most useful in the case of set returning functions such as unnest().

In combination with the LATERAL feature in pg 9.3+, and according to this thread on pgsql-hackers, the above query can now be written as:

SELECT t.id, a.elem, a.nr
FROM   tbl AS t
LEFT   JOIN LATERAL unnest(string_to_array(t.elements, ','))
                    WITH ORDINALITY AS a(elem, nr) ON TRUE;

LEFT JOIN ... ON TRUE preserves all rows in the left table, even if the table expression to the right returns no rows. If that's of no concern, you can use this otherwise equivalent, less verbose form with an implicit CROSS JOIN LATERAL:

SELECT t.id, a.elem, a.nr
FROM   tbl t, unnest(string_to_array(t.elements, ',')) WITH ORDINALITY a(elem, nr);

Or simpler, if based on an actual array (arr being an array column):

SELECT t.id, a.elem, a.nr
FROM   tbl t, unnest(t.arr) WITH ORDINALITY a(elem, nr);

Or even, with minimal syntax:

SELECT id, a, ordinality
FROM   tbl, unnest(arr) WITH ORDINALITY a;

a is automatically used as both table and column alias. The default name of the added ordinality column is ordinality. But it's better (safer, cleaner) to add explicit column aliases and table-qualify columns.
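The difference between LEFT JOIN LATERAL ... ON TRUE and the implicit CROSS JOIN LATERAL described above can be seen in a small self-contained sketch. The inline VALUES list stands in for the real table, and the row with id 3 and no elements is a hypothetical addition for illustration only:

```sql
-- Sketch only: the CTE replaces the real table tbl with sample data.
WITH tbl(id, elements) AS (
   VALUES (1, 'ab,cd,efg,hi')
        , (2, 'jk,lm,no,pq')
        , (3, NULL)              -- hypothetical row without any elements
   )
SELECT t.id, a.elem, a.nr
FROM   tbl AS t
LEFT   JOIN LATERAL unnest(string_to_array(t.elements, ','))
                    WITH ORDINALITY AS a(elem, nr) ON TRUE
ORDER  BY t.id, a.nr;            -- ORDER BY only makes the output easy to read
```

Here the row with id 3 should still appear once, with NULL for both elem and nr, because unnest() returns no rows for a NULL array. With the comma form (implicit CROSS JOIN LATERAL) that row would drop out of the result.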

Postgres 8.4 - 9.3

With row_number() OVER (PARTITION BY id ORDER BY elem) you get numbers according to the sort order, not the original ordinal position in the string.

You can simply omit ORDER BY:

SELECT *, row_number() OVER (PARTITION BY id) AS nr
FROM  (SELECT id, regexp_split_to_table(elements, ',') AS elem FROM tbl) t;

While this normally works and I have never seen it fail in simple queries, PostgreSQL guarantees nothing about the order of rows without ORDER BY. It happens to work due to an implementation detail.

To guarantee ordinal numbers of elements in a blank-separated string:

SELECT id, arr[nr] AS elem, nr
FROM  (
   SELECT *, generate_subscripts(arr, 1) AS nr
   FROM  (SELECT id, string_to_array(elements, ' ') AS arr FROM tbl) t
   ) sub;

Or simpler, if based on an actual array:

SELECT id, arr[nr] AS elem, nr
FROM  (SELECT *, generate_subscripts(arr, 1) AS nr FROM tbl) t;

In Postgres versions before 8.4, none of these features are available yet: RETURNS TABLE, generate_subscripts(), unnest(), array_length(). But this works:

CREATE FUNCTION f_unnest_ord(anyarray, OUT val anyelement, OUT ordinality integer)
  RETURNS SETOF record
  LANGUAGE sql IMMUTABLE AS
'SELECT $1[i], i - array_lower($1,1) + 1
 FROM   generate_series(array_lower($1,1), array_upper($1,1)) i';
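A possible way to call the function is sketched below, with the sample data from the question inlined as a VALUES list. The aliases t, sub, rec, elem and nr are illustrative only. VALUES lists require Postgres 8.2 or later, so on 8.1 the real table would have to take their place:

```sql
-- Sketch only: assumes f_unnest_ord() from above has already been created.
SELECT id, (rec).val AS elem, (rec).ordinality AS nr
FROM  (
   SELECT id, f_unnest_ord(string_to_array(elements, ',')) AS rec
   FROM  (VALUES (1, 'ab,cd,efg,hi')
               , (2, 'jk,lm,no,pq')
               , (3, 'rstuv,wxyz')
      ) t(id, elements)          -- stand-in for the real table
   ) sub;
```

The set-returning function is called in the SELECT list of the subquery and the resulting record is decomposed in the outer query, the same pattern the demo further down uses with (rec).*.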

Note in particular that the array index can differ from the ordinal position of an element. Consider this demo with an extended function:

CREATE FUNCTION f_unnest_ord_idx(anyarray, OUT val anyelement, OUT ordinality int, OUT idx int)
  RETURNS SETOF record
  LANGUAGE sql IMMUTABLE AS
'SELECT $1[i], i - array_lower($1,1) + 1, i
 FROM   generate_series(array_lower($1,1), array_upper($1,1)) i';

SELECT id, arr, (rec).*
FROM  (
   SELECT *, f_unnest_ord_idx(arr) AS rec
   FROM  (VALUES (1, '{a,b,c}'::text[])  --  short for: '[1:3]={a,b,c}'
               , (2, '[5:7]={a,b,c}')
               , (3, '[-9:-7]={a,b,c}')
      ) t(id, arr)
   ) sub;

 id |       arr       | val | ordinality | idx
----+-----------------+-----+------------+-----
  1 | {a,b,c}         | a   |          1 |   1
  1 | {a,b,c}         | b   |          2 |   2
  1 | {a,b,c}         | c   |          3 |   3
  2 | [5:7]={a,b,c}   | a   |          1 |   5
  2 | [5:7]={a,b,c}   | b   |          2 |   6
  2 | [5:7]={a,b,c}   | c   |          3 |   7
  3 | [-9:-7]={a,b,c} | a   |          1 |  -9
  3 | [-9:-7]={a,b,c} | b   |          2 |  -8
  3 | [-9:-7]={a,b,c} | c   |          3 |  -7
