重用计算出的选择值

本文介绍了重用计算出的选择值的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试使用 ST_SnapToGrid ，然后使用GROUP BY网格单元(x， y).这是我首先要做的:

I'm trying to use ST_SnapToGrid and then GROUP BY the grid cells (x, y). Here is what I did first:

SELECT
  COUNT(*)                      AS n,
  ST_X(ST_SnapToGrid(geom, 50)) AS x,
  ST_Y(ST_SnapToGrid(geom, 50)) AS y
FROM points
GROUP BY x, y

我不想为x和y都重新计算ST_SnapToGrid.所以我将其更改为使用子查询:

I don't want to recompute ST_SnapToGrid for both x and y. So I changed it to use a sub-query:

SELECT
  COUNT(*)   AS n,
  ST_X(geom) AS x,
  ST_Y(geom) AS y
FROM (
  SELECT
      ST_SnapToGrid(geom, 50) AS geom
  FROM points
) AS tmp
GROUP BY x, y

但是当我运行EXPLAIN时，这两个查询都具有完全相同的执行计划:

But when I run EXPLAIN, both of these queries have the exact same execution plan:

GroupAggregate  (...)
  ->  Sort  (...)
        Sort Key: (st_x(st_snaptogrid(points.geom, 0::double precision))), (st_y(st_snaptogrid(points.geom, 0::double precision)))
        ->  Seq Scan on points  (...)

问题:PostgreSQL是否会重用ST_SnapToGrid()的结果值?
如果没有，是否有办法做到这一点?

Question: Will PostgreSQL reuse the result value of ST_SnapToGrid()?
If not, is there a way to make it do this?

测试时间

在EXPLAIN输出中看不到每行各个功能的评估.

Test timing

You don't see the evaluation of individual functions per row in the EXPLAIN output.

使用EXPLAIN ANALYZE测试以获取实际查询时间以比较总体效果.运行几次以排除缓存工件.对于像这样的简单查询，您可以通过以下方式获得整个运行时的更可靠的数字:

Test with EXPLAIN ANALYZE to get actual query times to compare overall effectiveness. Run a couple of times to rule out caching artifacts. For simple queries like this, you get more reliable numbers for the total runtime with:

EXPLAIN (ANALYZE, TIMING OFF) SELECT ...

需要Postgres 9.2 + . 每个文档:

Requires Postgres 9.2+. Per documentation:

在输出中包括实际的启动时间和在每个节点上花费的时间.重复读取系统时钟的开销可能会减慢在某些系统上大大降低了查询量，因此对于仅当实际行计数而不精确时，将此参数设置为FALSE 时间，是必要的.整个语句的运行时间始终是即使使用此选项关闭了节点级时序，也可以进行测量. 仅当还启用了ANALYZE时，才可以使用此参数.它默认为TRUE.

Include actual startup time and time spent in each node in the output. The overhead of repeatedly reading the system clock can slow down the query significantly on some systems, so it may be useful to set this parameter to FALSE when only actual row counts, and not exact times, are needed. Run time of the entire statement is always measured, even when node-level timing is turned off with this option. This parameter may only be used when ANALYZE is also enabled. It defaults to TRUE.

防止重复评估

通常，子查询中的表达式一次评估一次.但是，如果Postgres认为这样做会更快，那么它可以使琐碎的子查询崩溃.

Prevent repeated evaluation

Normally, expressions in a subquery are evaluated once. But Postgres can collapse trivial subqueries if it thinks that will be faster.

要引入优化障碍，您可以使用 CTE 而不是子查询. Postgres仅计算一次ST_SnapToGrid(geom, 50)的保证:

To introduce an optimization barrier, you could use a CTE instead of the subquery. This guarantees that Postgres computes ST_SnapToGrid(geom, 50) once only:

WITH cte AS (
   SELECT ST_SnapToGrid(geom, 50) AS geom1
   FROM   points
   )
SELECT COUNT(*)   AS n
     , ST_X(geom1) AS x
     , ST_Y(geom1) AS y
FROM   cte
GROUP  BY geom1;         -- see below

但是，由于CTE的开销较大，因此它可能比子查询慢 .函数调用可能非常便宜.通常，Postgres会更好地了解如何优化查询计划.如果您更了解，请仅引入这样的优化障碍.

However, this it's probably slower than a subquery due to more overhead for a CTE. The function call is probably very cheap. Generally, Postgres knows better how to optimize a query plan. Only introduce such an optimization barrier if you know better.

我将子查询/CTE中计算点的名称更改为geom1，以阐明其与原始geom的不同.这有助于在此处阐明更重要的内容:

I changed the name of the computed point in the subquery / CTE to geom1 to clarify it's different from the original geom. That helps to clarify the more important thing here:

GROUP BY geom1

代替:

这显然更便宜-可能会对是否重复调用函数产生影响.因此，这可能是最快的:

That's obviously cheaper - and may have an influence on whether the function call is repeated. So, this is probably fastest:

SELECT COUNT(*) AS n
     , ST_X(ST_SnapToGrid(geom, 50)) AS x
     , ST_y(ST_SnapToGrid(geom, 50)) AS y
FROM   points
GROUP  BY ST_SnapToGrid(geom, 50);         -- same here!

或者也许是这样

SELECT COUNT(*)    AS n
     , ST_X(geom1) AS x
     , ST_y(geom1) AS y
FROM (
   SELECT ST_SnapToGrid(geom, 50) AS geom1
   FROM   points
   ) AS tmp
GROUP  BY geom1;

使用EXPLAIN ANALYZE或EXPLAIN (ANALYZE, TIMING OFF)测试所有三个，然后亲自查看.测试>>猜测.

Test all three with EXPLAIN ANALYZE or EXPLAIN (ANALYZE, TIMING OFF) and see for yourself. Testing >> guessing.

这篇关于重用计算出的选择值的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！

OFF