postgresql - A row_number-like function that restarts at 0 for each group

Suppose there is a table that stores a hierarchy like this:

item_id | hierarchical_id
--------+-----------------
    1   | ;1;
    2   | ;1;2;
    3   | ;1;2;3;
    4   | ;1;2;4;
    5   | ;1;2;4;5;

The hierarchy stored here has 1 as the root; 2 is a child of 1, 3 and 4 are children of 2, and 5 is a child of 4.
The query

SELECT
  -- the substr is used to remove the first and last semicolons
  regexp_split_to_table(substr(hierarchical_id, 2, length(hierarchical_id) - 2)
                        , E';'
  ) as parent_id,
  item_id,
  hierarchical_id
FROM
  table

returns

parent_id | item_id | hierarchical_id
----------+---------+-----------------
       1  |    1    | ;1;
       1  |    2    | ;1;2;
       2  |    2    | ;1;2;
       1  |    3    | ;1;2;3;
       3  |    3    | ;1;2;3;
       1  |    4    | ;1;2;3;
       2  |    4    | ;1;2;4;
       4  |    4    | ;1;2;4;
       1  |    5    | ;1;2;4;5;
       2  |    5    | ;1;2;4;5;
       4  |    5    | ;1;2;4;5;
       5  |    5    | ;1;2;4;5;

How can I modify the query to get a fourth column, as shown below:

parent_id | item_id | hierarchical_id | distance
----------+---------+-----------------+---------
       1  |    1    | ;1;             | 0
       1  |    2    | ;1;2;           | 1
       2  |    2    | ;1;2;           | 0
       1  |    3    | ;1;2;3;         | 2
       2  |    3    | ;1;2;3;         | 1
       3  |    3    | ;1;2;3;         | 0
       1  |    4    | ;1;2;4;         | 2
       2  |    4    | ;1;2;4;         | 1
       4  |    4    | ;1;2;4;         | 0
       1  |    5    | ;1;2;4;5;       | 3
       2  |    5    | ;1;2;4;5;       | 2
       4  |    5    | ;1;2;4;5;       | 1
       5  |    5    | ;1;2;4;5;       | 0

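distance is the distance between item_id and parent_id on the current row. For example, the distance between a node and itself is 0, between a node and its parent is 1, between a node and its parent's parent is 2, and so on. It does not have to start at 0.
row_number would work fine if I could make it restart at 0 for each group of equal item_ids, because the ids in hierarchical_id are ordered.
Any suggestions?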

Best answer

Window functions give you a lot of control; see section 4.2.8, Window Function Calls, in the PostgreSQL manual.
The key thing you need is:

row_number() OVER (PARTITION BY item_id ORDER BY parent_id DESC) - 1
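
To see the per-group restart in isolation, here is a minimal sketch on made-up values. The column names grp and val and the VALUES rows are hypothetical and are not part of the question's data. PARTITION BY starts the numbering again for every distinct grp, and subtracting 1 makes it begin at 0 instead of 1.

```sql
-- Hypothetical sample rows: two groups, 'a' and 'b'.
SELECT
  grp,
  val,
  -- row_number() counts 1, 2, 3, ... within each partition; subtract 1 to start at 0
  row_number() OVER (PARTITION BY grp ORDER BY val) - 1 AS n
FROM (VALUES ('a', 10), ('a', 20), ('b', 5), ('b', 7), ('b', 9)) AS v(grp, val)
ORDER BY grp, val;
```

Here n should run 0, 1 for group a and 0, 1, 2 for group b.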

Given the data:

create table t ( item_id integer, hierarchical_id text );
insert into t (item_id, hierarchical_id) values
(1,';1;'),
(2,';1;2;'),
(3,';1;2;3;'),
(4,';1;2;4;'),
(5,';1;2;4;5;');

the query:

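WITH x AS (
  -- strip the leading and trailing semicolons, then emit one row per id in the path
  SELECT regexp_split_to_table(substr(hierarchical_id, 2, length(hierarchical_id) - 2), E';') AS parent_id,
    item_id,
    hierarchical_id
  FROM t
)
SELECT
  *,
  -- parent_id is text, so cast it to sort numerically; numbering restarts for each item_id
  row_number() OVER (PARTITION BY item_id ORDER BY parent_id::integer DESC) - 1 AS distance
FROM x
ORDER BY item_id, parent_id::integer;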

produces:

 parent_id | item_id | hierarchical_id | distance
-----------+---------+-----------------+----------
 1         |       1 | ;1;             |        0
 1         |       2 | ;1;2;           |        1
 2         |       2 | ;1;2;           |        0
 1         |       3 | ;1;2;3;         |        2
 2         |       3 | ;1;2;3;         |        1
 3         |       3 | ;1;2;3;         |        0
 1         |       4 | ;1;2;4;         |        2
 2         |       4 | ;1;2;4;         |        1
 4         |       4 | ;1;2;4;         |        0
 1         |       5 | ;1;2;4;5;       |        3
 2         |       5 | ;1;2;4;5;       |        2
 4         |       5 | ;1;2;4;5;       |        1
 5         |       5 | ;1;2;4;5;       |        0

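This looks roughly right, but since your expected output doesn't seem to match the output of the query you provided when I run it (on Pg 9.1), it's hard to be sure.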
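
The row_number approach relies on the ids along each path being in increasing order, as the question states. If that assumption might not hold, one possible alternative is to take the distance from each id's position in the path instead of from its value. The sketch below assumes PostgreSQL 9.4 or later (for WITH ORDINALITY, so it would not run on 9.1) and the table t defined above; the aliases p and pos are names chosen here for illustration.

```sql
SELECT
  p.parent_id,
  t.item_id,
  t.hierarchical_id,
  -- number of ids in the path minus this id's 1-based position
  -- = how many steps up from item_id this ancestor sits
  count(*) OVER (PARTITION BY t.item_id) - p.pos AS distance
FROM t
-- btrim strips the leading and trailing semicolons before splitting;
-- WITH ORDINALITY adds the position of each piece as a second column
CROSS JOIN LATERAL regexp_split_to_table(btrim(t.hierarchical_id, ';'), ';')
     WITH ORDINALITY AS p(parent_id, pos)
ORDER BY t.item_id, p.pos;
```

For the sample data this should give the same distance values as the query above, without sorting on parent_id at all.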

Regarding "postgresql - A row_number-like function that restarts at 0 for each group", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12835025/