我有一组表,它们定义了一些需要遵循的规则,例如:
CREATE TABLE foo.subrules (
subruleid SERIAL PRIMARY KEY,
ruleid INTEGER REFERENCES foo.rules(ruleid),
subrule INTEGER,
barid INTEGER REFERENCES foo.bars(barid)
);
INSERT INTO foo.subrules(ruleid,subrule,barid) VALUES
(1,1,1),
(1,1,2),
(1,2,2),
(1,2,3),
(1,2,4),
(1,3,3),
(1,3,4),
(1,3,5),
(1,3,6),
(1,3,7);
这定义的是一组需要满足的“子规则”。。。如果满足所有“子规则”,则也满足规则。
在上面的例子中,“subruleid”
1
可以满足“barid”值1
或2
。此外,“subruleid”
2
可以满足“barid”值2
、3
或4
。同样,“subruleid”
3
可以满足“barid”值3
、4
、5
、6
或7
。我还有一个数据集如下所示:
primarykey | resource | barid
------------|------------|------------
1 | A | 1
2 | B | 2
3 | C | 8
棘手的是,一旦“子规则”满足了“资源”,那么“资源”就不能满足任何其他“子规则”(即使相同的“barid”满足了其他“子规则”)
所以,我需要评估并返回以下结果:
ruleid | subrule | barid | primarykey | resource
------------|------------|------------|------------|------------
1 | 1 | 1 | 1 | A
1 | 1 | 2 | NULL | NULL
1 | 2 | 2 | 2 | B
1 | 2 | 3 | NULL | NULL
1 | 2 | 4 | NULL | NULL
1 | 3 | 3 | NULL | NULL
1 | 3 | 4 | NULL | NULL
1 | 3 | 5 | NULL | NULL
1 | 3 | 6 | NULL | NULL
1 | 3 | 7 | NULL | NULL
NULL | NULL | NULL | 3 | C
有趣的是,如果“primarykey”
3
的“barid”值为2
(而不是8
),则结果将是相同的。我尝试了多种方法,包括一个
plpgsql
函数,该函数使用ARRAY_AGG(barid)
按“subruleid”执行分组,并从barid
构建一个数组,通过循环检查barid
数组中的每个元素是否在“subruleid”组中,但感觉不太对劲。是否有更优雅或更高效的选择?
最佳答案
下面的片段找到了解决方案(如果有的话)。第三个(资源)是硬编码的。如果只需要一个解,就应该加上一个对称破缺器。
如果资源的数量没有限制,我认为可以通过枚举所有可能的tableaux(Hilbert?混合基?),并从中挑选,修剪后不满意的。
-- the data
CREATE TABLE subrules
( subruleid SERIAL PRIMARY KEY
, ruleid INTEGER -- REFERENCES foo.rules(ruleid),
, subrule INTEGER
, barid INTEGER -- REFERENCES foo.bars(barid)
);
INSERT INTO subrules(ruleid,subrule,barid) VALUES
(1,1,1), (1,1,2),
(1,2,2), (1,2,3), (1,2,4),
(1,3,3), (1,3,4), (1,3,5), (1,3,6), (1,3,7);
CREATE TABLE resources
( primarykey INTEGER NOT NULL PRIMARY KEY
, resrc varchar
, barid INTEGER NOT NULL
);
INSERT INTO resources(primarykey,resrc,barid) VALUES
(1, 'A', 1) ,(2, 'B', 2) ,(3, 'C', 8)
-- ################################
-- uncomment next line to find a (two!) solution(s)
-- ,(4, 'D', 7)
;
-- all matching pairs of subrules <--> resources
WITH pairs AS (
SELECT sr.subruleid, sr.ruleid, sr.subrule, sr.barid
, re.primarykey, re.resrc
FROM subrules sr
JOIN resources re ON re.barid = sr.barid
)
SELECT
p1.ruleid AS ru1 , p1.subrule AS sr1 , p1.resrc AS one
, p2.ruleid AS ru2 , p2.subrule AS sr2 , p2.resrc AS two
, p3.ruleid AS ru3 , p3.subrule AS sr3 , p3.resrc AS three
-- self-join the pairs, excluding the ones that
-- use the same subrule or resource
FROM pairs p1
JOIN pairs p2 ON p2.primarykey > p1.primarykey -- tie-breaker
JOIN pairs p3 ON p3.primarykey > p2.primarykey -- tie breaker
WHERE 1=1
AND p2.subruleid <> p1.subruleid
AND p2.subruleid <> p3.subruleid
AND p3.subruleid <> p1.subruleid
;
结果(在取消注释缺少资源的行之后):
ru1 | sr1 | one | ru2 | sr2 | two | ru3 | sr3 | three
-----+-----+-----+-----+-----+-----+-----+-----+-------
1 | 1 | A | 1 | 1 | B | 1 | 3 | D
1 | 1 | A | 1 | 2 | B | 1 | 3 | D
(2 rows)
资源{A,B,C}当然可以硬编码,但这会阻止“D”记录(或任何其他记录)充当丢失的链接。
关于sql - PostgreSQL 9.3-比较两组数据而不重复第一组中的值,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/28468458/