问题描述
我正在尝试编写一些 Oracle 11g SQL,但遇到了一些先有鸡还是先有蛋的问题.我正在寻找类似电子表格的行为.我找到了一个确实使用 Oracle 的 MODEL
子句的解决方案,但性能并不好.所以我想知道非MODEL
"解决方案在技术上是否可行.
I'm trying to write some Oracle 11g SQL, but I'm running into a bit of a chicken-and-egg problem. I'm looking for spreadsheet-like behavior. I've found a solution that does use Oracle's MODEL
clause, but the performance isn't great. So I'm wondering if a "non-MODEL
" solution is even technically feasible.
这是一个演示我正在尝试做的事情的玩具示例.鉴于此表:
Here's a toy example that demonstrates what I'm trying to do. Given this table:
CREATE TABLE t (id NUMBER PRIMARY KEY, n NUMBER);
INSERT INTO t (id, n) VALUES (2, 0);
INSERT INTO t (id, n) VALUES (3, 1);
INSERT INTO t (id, n) VALUES (5, 1);
INSERT INTO t (id, n) VALUES (7, 2);
INSERT INTO t (id, n) VALUES (11, 3);
INSERT INTO t (id, n) VALUES (13, 5);
INSERT INTO t (id, n) VALUES (17, 8);
INSERT INTO t (id, n) VALUES (19, 13);
我想计算两个额外的派生列,将它们称为 X
和 Y
.
I want to compute two additional, derived columns, call them X
and Y
.
以下是如何计算 X
和 Y
的规则:
Here are the rules for how X
and Y
are to be computed:
X:对于第一行,由 ID 的最小值定义,将 X
设置为 N
.对于所有后续行,X
的值应比前一个 Y
的值小 1,按 ID
排序.
你:两次 N
加上 X
.
接下来的几个步骤展示了如果我要手动执行此操作,我将如何填写所需的视图.首先,给定数据的前几行:
These next few steps show how I'd fill out my desired view if I were to do this by hand. First, the first few rows of the given data:
ID N X Y
--- --- --- ---
2 0
3 1
5 1
7 2
....
由于我们在第一行,X
应该设置为 N
或 0
.Y
应该是 2 * N + X
,或者 0
.
Since we're in the first row, X
should be set to N
, or 0
. Y
should be 2 * N + X
, or 0
.
ID N X Y
--- --- --- ---
2 0 0 0
3 1
5 1
7 2
....
现在,由于我们不再位于第一行,从现在开始,X
应该总是比前一行的 Y
小一个.在第二行,这意味着 X
= (previous Y
) - 1
= 0 - 1
= .第二行的 Y
将是 2 * N + X
,或 2 * (1) + (-1)
= 1
.
Now, since we're not in the first row any longer, X
should always be one less than the previous row's Y
from here on out. Here in the second row, that means X
= (previous Y
) - 1
= 0 - 1
= -1
. And the second row's Y
will be 2 * N + X
, or 2 * (1) + (-1)
= 1
.
ID N X Y
--- --- --- ---
2 0 0 0
3 1 -1 1
5 1
7 2
....
如果你继续数学,这里是你想要的结果:
If you continue with the math, here's the desired outcome:
ID N X Y
--- --- --- ---
2 0 0 0
3 1 -1 1
5 1 0 2
7 2 1 5
11 3 4 10
13 5 9 19
17 8 18 34
19 13 33 59
给定如何计算 X
和 Y
的规则,是否可以在不必求助于 MODEL
子句的情况下获得此结果?
Given the rules for how X
and Y
are computed, is it possible to get this outcome without having to resort to the MODEL
clause?
我不是在寻找基于此特定示例的数学简化;这只是我想出的一个玩具示例,它展示了我在实际问题中面临的那种相互依赖.
I'm not looking for a mathematical simplification based on this particular example; this is just a toy example I came up with that demonstrates the kind of interdependence I'm facing in my actual problem.
P.S.:这是一个 MODEL
示例,我可以拼凑起来生成此输出;也许有可能进行修改以提高性能?
P.S.: Here's a MODEL
example I was able to cobble together that does generate this output; maybe there are modifications possible to improve performance?
SQL> WITH u AS (
2 SELECT ROW_NUMBER() OVER (ORDER BY t.id) r
3 , t.id
4 , t.n
5 FROM t
6 )
7 SELECT r
8 , id
9 , n
10 , x
11 , y
12 FROM u
13 MODEL
14 DIMENSION BY (r)
15 MEASURES (id
16 , n
17 , CAST(NULL AS NUMBER) x
18 , CAST(NULL AS NUMBER) y) RULES AUTOMATIC ORDER
19 ( x[1] = n[cv()]
20 , y[r] = 2 * n[cv()] + x[cv()]
21 , x[r > 1] ORDER BY r = y[cv() - 1] - 1
22 )
23 ;
R ID N X Y
---------- ---------- ---------- ---------- ----------
1 2 0 0 0
2 3 1 -1 1
3 5 1 0 2
4 7 2 1 5
5 11 3 4 10
6 13 5 9 19
7 17 8 18 34
8 19 13 33 59
8 rows selected.
SQL>
谢谢.
推荐答案
您可以使用 递归子查询分解(也称为递归 CTE):
You could use recursive subquery factoring (also known as a recursive CTE):
with tmp as (
select t.*,
row_number() over (order by t.id) as rn
from t
),
r (id, n, x, y, rn) as (
select id, n, 0, 0, rn
from tmp
where rn = 1
union all
select tmp.id, tmp.n, r.y - 1, (tmp.n * 2) + r.y - 1, tmp.rn
from r
join tmp on tmp.rn = r.rn + 1
)
select id, n, x, y
from r
order by rn;
ID N X Y
---------- ---------- ---------- ----------
2 0 0 0
3 1 -1 1
5 1 0 2
7 2 1 5
11 3 4 10
13 5 9 19
17 8 18 34
19 13 33 59
它基本上是通过您的手动步骤.锚成员是您的第一个手动步骤,将第一行的 x
和 y
都设置为零.然后递归成员执行您指定的计算.(在计算该行的 y
时,您不能引用新计算的 x
值,因此您必须将其重复为 (tmp.n * 2)+ ry - 1
).rn
只是按照 ID 保持行的顺序,同时更容易找到下一行 - 所以你可以寻找 rn + 1
而不是找到下一个最高的 ID直接取值.
It's basically walking through your manual steps. The anchor member is your first manual step, setting x
and y
both to zero for the first row. The recursive member then does the calculation you specified. (You can't refer to the new-calculated x
value when calculating that row's y
, so you have to repeat that as (tmp.n * 2) + r.y - 1
). The rn
is just to keep the rows orders by ID while making it easier to find the next row - so you can look for rn + 1
instead of find the next highest ID value directly.
您的示例数据没有显着的性能差异,但是添加了一千行后,模型子句大约需要 5 秒,递归 CTE 大约需要 1 秒;另一千行模型需要约 20 秒,CTE 需要约 3 秒;另一千行模型需要约 40 秒,CTE 需要约 6 秒;再有一千行(总共 4,008 行)模型用了大约 75 秒,CTE 用了大约 10 秒.(我厌倦了等待具有更多行的模型版本;五分钟后用 10,000 杀死了它).我真的不能说这对您的真实数据的表现如何,但在此基础上,它可能值得一试.
There's no significant performance difference with your sample data, but with a thousand rows added the model clause takes about 5 seconds and the recursive CTE takes about 1 second; with another thousand rows model takes ~20 seconds and the CTE takes ~3 seconds; with another thousand rows model took ~40 seconds and CTE took ~6 seconds; and with another thousand rows (so 4,008 in total) model took ~75 seconds and CTE took ~10 seconds. (I got bored waiting for the model version with more rows than that; killed it after a five minutes with 10,000). I can't really say how this will perform with your real data, but on that basis, it's probably worth trying.
这篇关于视图需要相互依赖的逻辑:没有模型可能吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!