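I'm trying to work out how to abstract my database access on top of Postgres. In my examples I'll use a hypothetical Twitter clone in Node.js, but in the end the question is about how Postgres handles prepared statements, so the language and library don't really matter:
Suppose I want to fetch a list of all tweets from a user by username: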
name: "tweets by username"
text: "SELECT (SELECT * FROM tweets WHERE tweets.user_id = users.user_id) FROM users WHERE users.username = $1"
values: [username]
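This works fine, but it seems inefficient, both from a practical standpoint and from a code-quality standpoint, to have to create another function just to fetch tweets by email instead of by username: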
name: "tweets by email"
text: "SELECT (SELECT * FROM tweets WHERE tweets.user_id = users.user_id) FROM users WHERE users.email = $1"
values: [email]
Is it possible to include a field as a parameter in a prepared statement?
name: "tweets by user"
text: "SELECT (SELECT * FROM tweets WHERE tweets.user_id = users.user_id) FROM users WHERE users.$1 = $2"
values: [field, value]
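Granted, this may be a bit less efficient in the edge case of accessing tweets by user id, but that is a trade-off I'm willing to make to improve code quality, and I hope it improves efficiency overall by reducing the number of query templates to 1 instead of 3+.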
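For context, a parameter placeholder such as $1 in a Postgres prepared statement can stand only for a data value, not for an identifier such as a column name, so `users.$1 = $2` will not parse. One possible workaround, sketched below under assumptions, is to generate one statement per allow-listed column from a single function. The function name `tweetsByUserField` and the allow-list are hypothetical, the `{ name, text, values }` shape follows the examples above, and the SQL is rewritten as a plain join rather than copied from the question.

```javascript
// Hypothetical allow-list of columns that may be used to look up a user.
const USER_LOOKUP_COLUMNS = new Set(["username", "email", "user_id"]);

// Build a query config in the { name, text, values } shape used in the question.
// The column name is interpolated into the SQL text, so it must come from the
// allow-list and never directly from user input.
function tweetsByUserField(field, value) {
  if (!USER_LOOKUP_COLUMNS.has(field)) {
    throw new Error(`unsupported lookup field: ${field}`);
  }
  return {
    // Each column gets its own statement name, so each is prepared separately.
    name: `tweets by ${field}`,
    text:
      "SELECT tweets.* FROM tweets " +
      "JOIN users ON tweets.user_id = users.user_id " +
      `WHERE users.${field} = $1`,
    values: [value],
  };
}
```

This still yields several prepared statements on the server, but only one template in the application code, and each statement can be planned against the index for its own column.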
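Best answer
@Clodoaldo's answer is correct in that it gives you the functionality you want and should return the right results. Unfortunately, it executes rather slowly.
I set up an experimental database with tweets and users, populated with 10,000 users having 100 tweets each (1 million tweet records). I indexed the PKs u.id and t.id, the FK t.user_id, and the predicate fields u.name (the username) and u.email.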
create table t(id serial PRIMARY KEY, data integer, user_id bigint);
create index t1 on t(user_id);
create table u(id serial PRIMARY KEY, name text, email text);
create index u1 on u(name);
create index u2 on u(email);
insert into u(name,email) select i::text, i::text from generate_series(1,10000) i;
insert into t(data,user_id) select i, (i/100)::bigint from generate_series(1,1000000) i;
analyze t;
analyze u;
A simple query using a single field as the predicate is very fast:
prepare qn as select t.* from t join u on t.user_id = u.id where u.name = $1;
explain analyze execute qn('1111');
Nested Loop (cost=0.00..19.81 rows=1 width=16) (actual time=0.030..0.057 rows=100 loops=1)
-> Index Scan using u1 on u (cost=0.00..8.46 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=1)
Index Cond: (name = $1)
-> Index Scan using t1 on t (cost=0.00..10.10 rows=100 width=16) (actual time=0.007..0.023 rows=100 loops=1)
Index Cond: (t.user_id = u.id)
Total runtime: 0.093 ms
By contrast, the query using CASE, as proposed by @Clodoaldo, takes about 26 seconds in this run:
prepare qen as select t.* from t join u on t.user_id = u.id
where case $2 when 'e' then u.email = $1 when 'n' then u.name = $1 end;
explain analyze execute qen('1111','n');
Merge Join (cost=25.61..38402.69 rows=500000 width=16) (actual time=27.771..26345.439 rows=100 loops=1)
Merge Cond: (t.user_id = u.id)
-> Index Scan using t1 on t (cost=0.00..30457.35 rows=1000000 width=16) (actual time=0.023..17.741 rows=111200 loops=1)
-> Index Scan using u_pkey on u (cost=0.00..42257.36 rows=500000 width=4) (actual time=0.325..26317.384 rows=1 loops=1)
Filter: CASE $2 WHEN 'e'::text THEN (u.email = $1) WHEN 'n'::text THEN (u.name = $1) ELSE NULL::boolean END
Total runtime: 26345.535 ms
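Looking at this plan, I figured that using a union of sub-selects, and then filtering its result to get the ids matching the parameterized choice of predicate, would let the planner use a specific index for each predicate. It turned out I was right: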
prepare qen2 as
select t.*
from t
join (
SELECT id from
(
SELECT 'n' as fld, id from u where u.name = $1
UNION ALL
SELECT 'e' as fld, id from u where u.email = $1
) poly
where poly.fld = $2
) uu
on t.user_id = uu.id;
explain analyze execute qen2('1111','n');
Nested Loop (cost=0.00..28.31 rows=100 width=16) (actual time=0.058..0.120 rows=100 loops=1)
-> Subquery Scan poly (cost=0.00..16.96 rows=1 width=4) (actual time=0.041..0.073 rows=1 loops=1)
Filter: (poly.fld = $2)
-> Append (cost=0.00..16.94 rows=2 width=4) (actual time=0.038..0.070 rows=2 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.00..8.47 rows=1 width=4) (actual time=0.038..0.038 rows=1 loops=1)
-> Index Scan using u1 on u (cost=0.00..8.46 rows=1 width=4) (actual time=0.038..0.038 rows=1 loops=1)
Index Cond: (name = $1)
-> Subquery Scan "*SELECT* 2" (cost=0.00..8.47 rows=1 width=4) (actual time=0.031..0.032 rows=1 loops=1)
-> Index Scan using u2 on u (cost=0.00..8.46 rows=1 width=4) (actual time=0.030..0.031 rows=1 loops=1)
Index Cond: (email = $1)
-> Index Scan using t1 on t (cost=0.00..10.10 rows=100 width=16) (actual time=0.015..0.028 rows=100 loops=1)
Index Cond: (t.user_id = poly.id)
Total runtime: 0.170 ms
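From application code, the qen2 statement could be wrapped so that callers pass a readable field name instead of the one-letter tag. The following is only a sketch: `FIELD_TAGS`, `TWEETS_BY_USER_SQL` and `tweetsByUserQuery` are hypothetical names, the `{ name, text, values }` shape follows the question's examples, and the tables `t` and `u` are the ones from the experiment above.

```javascript
// Hypothetical mapping from a readable field name to the tag used inside the
// UNION ALL subquery ('n' selects the u.name branch, 'e' the u.email branch).
const FIELD_TAGS = new Map([
  ["name", "n"],
  ["email", "e"],
]);

// SQL text of the qen2 statement: $1 is the value to look up, $2 is the tag
// that decides which branch's rows survive the outer filter.
const TWEETS_BY_USER_SQL = `
select t.*
from t
join (
  select id from (
    select 'n' as fld, id from u where u.name = $1
    union all
    select 'e' as fld, id from u where u.email = $1
  ) poly
  where poly.fld = $2
) uu on t.user_id = uu.id`;

// Build a single query config that serves both lookup fields.
function tweetsByUserQuery(field, value) {
  const tag = FIELD_TAGS.get(field);
  if (tag === undefined) {
    throw new Error(`unsupported lookup field: ${field}`);
  }
  return { name: "tweets by user", text: TWEETS_BY_USER_SQL, values: [value, tag] };
}
```

With this shape there is one statement name and one SQL text regardless of the field, so `tweetsByUserQuery("name", "1111")` carries the same values as `execute qen2('1111','n')` above.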
Source: the Stack Overflow question "Postgres prepared statements with varying fields": https://stackoverflow.com/questions/9754372/