问题描述
我正在使用两个PostgreSQL 9.6数据库,并尝试使用postgres_fdw从另一个数据库中查询一个数据库(一个是具有数据的生产备份数据库,另一个是用于进行各种分析的数据库)。
I'm working with two PostgreSQL 9.6 databases and am trying to query one of the DB's from the other using postgres_fdw (one is a production backup DB that has the data and the other is a db for doing various analyses).
我遇到了一些奇怪的行为,尽管查询中的某些类型的WHERE子句没有传递给远程数据库,而是保留在本地数据库中并用于过滤从远程数据库收到的结果。这导致远程数据库尝试通过网络发送比本地数据库所需更多的信息,并且受影响的查询速度大大降低(15秒对15分钟)。
I've come across some odd behavior though where certain types of WHERE clauses in a query aren't being passed to the remote DB but instead are retained in the local DB and used to filter the results received from the remote DB. This is causing the remote DB to try to send way more info than the local DB needs across the network and the affected queries are dramatically slower (15 seconds vs 15 minutes).
我主要通过与时间戳相关的子句来了解这一点,下面的示例是我第一次遇到此问题的方式,但是我在其他一些变体中也看到了它,例如用TIMESTAMP文字(慢)或TIMESTAMP WITH TIME ZONE文字(快速)替换CURRENT_TIMESTAMP。
I've mostly seen this with timestamp related clauses, the below examples are how I first came across the problem, but I've seen it in several other variations such as replacing CURRENT_TIMESTAMP with a TIMESTAMP literal (slow) or TIMESTAMP WITH TIME ZONE literal (fast).
我缺少某处的设置可以对此有所帮助吗?我将其设置为团队使用混合级别的SQL背景,而大多数人都不具备审查EXPLAIN计划之类的经验。我已经提出了一些解决方法(例如将相对时间子句放在sub-SELECT中),但是我不断遇到新的问题实例。
Is there a setting somewhere that I'm missing that'll help with this? I'm setting this up for a team to use with a mixed level of SQL backgrounds, most aren't experienced with reviewing EXPLAIN plans and whatnot. I've come up with some workarounds (such as putting the relative time clauses in sub-SELECT), but I keep coming across new instances of the problem.
一个示例:
SELECT var_1
,var_2
FROM schema_A.table_A
WHERE execution_ts <= CURRENT_TIMESTAMP - INTERVAL '1 hour'
AND execution_ts >= CURRENT_TIMESTAMP - INTERVAL '1 week' - INTERVAL '1 hour'
ORDER BY var_1
说明计划
Sort (cost=147.64..147.64 rows=1 width=1048)
Output: table_A.var_1, table_A.var_2
Sort Key: (table_A.var_1)::text
-> Foreign Scan on schema_A.table_A (cost=100.00..147.63 rows=1 width=1048)
Output: table_A.var_1, table_A.var_2
Filter: ((table_A.execution_ts <= (now() - '01:00:00'::interval))
AND (table_A.execution_ts >= ((now() - '7 days'::interval) - '01:00:00'::interval)))
Remote SQL: SELECT var_1, execution_ts FROM model.table_A
WHERE ((model_id::text = 'ABCD'::text))
AND ((var_1 = ANY ('{1,2,3,4,5}'::bigint[])))
上面的代码大约需要15-20分钟才能运行,而下面的代码只需几秒钟即可完成。
The above takes around 15-20 minutes to run, while the below completes in seconds.
SELECT var_1
,var_2
FROM schema_A.table_A
WHERE execution_ts <= (SELECT CURRENT_TIMESTAMP - INTERVAL '1 hour')
AND execution_ts >= (SELECT CURRENT_TIMESTAMP - INTERVAL '1 week' - INTERVAL '1 hour')
ORDER BY var_1
说明计划
Sort (cost=158.70..158.71 rows=1 width=16)
Output: table_A.var_1, table_A.var_2
Sort Key: table_A.var_1
InitPlan 1 (returns $0)
-> Result (cost=0.00..0.01 rows=1 width=8)
Output: (now() - '01:00:00'::interval)
InitPlan 2 (returns $1)
-> Result (cost=0.00..0.02 rows=1 width=8)
Output: ((now() - '7 days'::interval) - '01:00:00'::interval)
-> Foreign Scan on schema_A.table_A (cost=100.00..158.66 rows=1 width=16)
Output: table_A.var_1, table_A.var_2
Remote SQL: SELECT var_1, var_2 FROM model.table_A
WHERE ((execution_ts <= $1::timestamp with time zone))
AND ((execution_ts >= $2::timestamp with time zone))
AND ((model_id::text = 'ABCD'::text))
AND ((var_1 = ANY ('{1,2,3,4,5}'::bigint[])))
推荐答案
任何不是 IMMUTABLE
的函数都不会
请参见 contrib / postgres_fdw / deparse.c is_foreign_expr
/ code>:
See function is_foreign_expr
in contrib/postgres_fdw/deparse.c
:
/*
* Returns true if given expr is safe to evaluate on the foreign server.
*/
bool
is_foreign_expr(PlannerInfo *root,
RelOptInfo *baserel,
Expr *expr)
{
...
/*
* An expression which includes any mutable functions can't be sent over
* because its result is not stable. For example, sending now() remote
* side could cause confusion from clock offsets. Future versions might
* be able to make this choice with more granularity. (We check this last
* because it requires a lot of expensive catalog lookups.)
*/
if (contain_mutable_functions((Node *) expr))
return false;
/* OK to evaluate on the remote server */
return true;
}
这篇关于Postgres_FDW不降低WHERE标准的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!