模块下的列排序的完整层次结构

模块下的列排序的完整层次结构

本文介绍了检索按 PostgreSQL 的 Ltree 模块下的列排序的完整层次结构的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用 PostgreSQL 的 Ltree 模块来存储分层数据.我希望检索按特定列排序的完整层次结构.

I'm using PostgreSQL's Ltree module for storing hierarchical data. I'm looking to retrieve the full hierarchy sorted by a particular column.

考虑下表:

  votes | path  | ...
 -------+-------+-----
      1 | 1     | ...
      2 | 1.1   | ...
      4 | 1.2   | ...
      1 | 1.2.1 | ...
      3 | 2     | ...
      1 | 2.1   | ...
      2 | 2.1.1 | ...
      4 | 2.1.2 | ...
    ... | ...   | ...

在我当前的实现中,我使用 SELECT * FROM comments ORDER BY path 查询数据库,这将返回整个树:

In my current implementation, I'd query the database with SELECT * FROM comments ORDER BY path, which would return the whole tree:

Node 1
-- Node 1.1
-- Node 1.2
---- Node 1.2.1
Node 2
-- Node 2.1
---- Node 2.1.1
---- Node 2.1.2

但是,我想按 votes 排序(不是按 id,这就是按 path 排序的意思).每个深度级别都需要独立排序,正确的树结构保持完整.会返回以下内容:

However, I want to sort by votes (not by id, which is what sorting by path amounts to). Each depth level needs to be independently sorted, with the correct tree structure kept intact. Something that would return the following:

Node 2
-- Node 2.1
---- Node 2.1.2
---- Node 2.1.1
Node 1
-- Node 1.2
---- Node 1.2.1
-- Node 1.1

Postgres 的 WITH RECURSIVE 可能是合适的,但我不确定.有什么想法吗?

Postgres' WITH RECURSIVE might be appropriate, but I'm not sure. Any ideas?

推荐答案

带递归.

WITH RECURSIVE t AS (
    SELECT t.votes
         , t.path
         , 1::int AS lvl
         , to_char(t2.votes, 'FM0000000')  AS sort
    FROM   tbl t
    JOIN   tbl t2 ON t2.path = subltree(t.path, 0, 1)

    UNION ALL
    SELECT t.votes
         , t.path
         , t.lvl + 1
         , t.sort || to_char(t2.votes, 'FM0000000')
    FROM   t
    JOIN   tbl t2 ON t2.path = subltree(t.path, 0, t.lvl + 1)
    WHERE  nlevel(t.path) > t.lvl
    )
SELECT votes, path, max(sort) AS sort
FROM   t
GROUP  BY 1, 2
ORDER  BY max(sort), path;

要点

  • 关键部分是用votes的值替换路径的每一层.因此,我们组装了一列,我们可以在最后ORDER BY.这是必要的,因为路径的深度未知,我们无法在静态 SQL 中按未知数量的表达式进行排序.

    Major points

    • The crucial part is to replace every level of the path with the value of votes. Thereby we assemble one column we can ORDER BY at the end. This is necessary, because the path has an unknown depth and we cannot order by an unknown number of expressions in static SQL.

      为了获得稳定的排序,我使用 .我在演示中使用了七位数字,它适用于低于 10.000.000 的投票值.根据您的最大投票数进行调整.

      In order to get a stable sort, I convert votes to a string with leading zeroes using to_char(). I use seven digits in the demo, which works for vote values below 10.000.000. Adjust according to your maximum vote count.

      在最终的 SELECT 中,我排除了所有中间状态以消除重复项.只剩下带有 max(sort) 的最后一步.

      In the final SELECT I exclude all intermediary states to eliminate duplicates. Only the last step with max(sort) remains.

      这在具有递归 CTE 的标准 SQL 中有效,但对于大型树来说效率不高.以递归方式更新临时表中的排序路径而不创建临时重复项的 plpgsql 函数可能会执行得更好.

      This works in standard SQL with a recursive CTE, but is not very efficient for large trees. A plpgsql function that recursively updates the sort path in a temporary table without creating temporary dupes might perform better.

      仅适用于已安装的附加模块 ltree提供函数 subltree()nlevel(),以及 ltree 数据类型.

      Only works with the additional module ltree installed, which provides the functions subltree() and nlevel(), as well as the ltree data type.

      我的测试设置,为了方便查看:

      My test setup, for review convenience:

      CREATE TEMP TABLE tbl(votes int, path ltree);
      INSERT INTO tbl VALUES
        (1, '1')
      , (2, '1.1')
      , (4, '1.2')
      , (1, '1.2.1')
      , (3, '2')
      , (1, '2.1')
      , (2, '2.1.1')
      , (4, '2.1.2')
      , (1, '2.1.3')
      , (2, '3')
      , (17, '3.3')
      , (99, '3.2')
      , (10, '3.1.1')
      , (2345, '3.1.2')
      , (1, '3.1.3')
      ;
      

      PL/pgSQL 表函数做同样的事情

      大树应该更快.

      CREATE OR REPLACE FUNCTION f_sorted_ltree()
        RETURNS TABLE(votes int, path ltree)
        LANGUAGE plpgsql VOLATILE AS
      $func$
      DECLARE
         lvl integer := 0;
      BEGIN
         CREATE TEMP TABLE t ON COMMIT DROP AS
         SELECT tbl.votes
              , tbl.path
              , ''::text AS sort
              , nlevel(tbl.path) AS depth
         FROM   tbl;
      
         -- CREATE INDEX t_path_idx ON t (path);   -- beneficial for huge trees
         -- CREATE INDEX t_path_idx ON t (depth);
      
         LOOP
            lvl := lvl + 1;
      
            UPDATE t SET sort = t.sort || to_char(v.votes, 'FM0000000')
            FROM  (
               SELECT t2.votes, t2.path
               FROM   t t2
               WHERE  t2.depth = lvl
               ) v
            WHERE  v.path = subltree(t.path, 0 ,lvl);
      
            EXIT WHEN NOT FOUND;
         END LOOP;
      
         -- Return sorted rows
         RETURN QUERY
         SELECT t.votes, t.path
         FROM   t
         ORDER  BY t.sort;
      END
      $func$;
      

      调用:

      SELECT * FROM f_sorted_ltree();
      

      阅读关于设置temp_buffers.

      我很感兴趣,哪个在您的现实生活数据中表现得更快.

      I would be interested which performs faster with your real life data.

      这篇关于检索按 PostgreSQL 的 Ltree 模块下的列排序的完整层次结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-01 22:31