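I have many tables that follow this fairly common pattern: A <-->> B (rows in B reference a row in A). I want to find pairs of rows in table A that have equal values in certain columns and whose referencing rows in table B also have equal values in certain columns. In other words, given columns {a1, a2, …, an} in A and {b1, b2, …, bn} in B, a pair of rows (R, S) in A matches when:
1. R.a1 = S.a1, R.a2 = S.a2, …, R.an = S.an.
2. For every row T in B that references R, there exists a row U in B that references S such that T.b1 = U.b1, T.b2 = U.b2, …, T.bn = U.bn.
3. (R, S) matches if (S, R) matches.
(I'm not very familiar with relational algebra, so the definition above may not follow any convention.)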
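As a rough illustration of conditions 2 and 3 only (the comparison of columns in A is left out), the check can be read as set equality over the values carried by each parent's referencing rows. The sketch below is plain Python rather than SQL; `B_ROWS`, `values_by_parent` and `matching_pairs` are hypothetical names, and the sample tuples mirror the test data given later in the question. It assumes a single compared column and ignores parents that have no referencing rows.

```python
from itertools import combinations

# Illustrative in-memory copy of table b as (id, a_id, data) tuples,
# mirroring the test data given later in the question.
B_ROWS = [
    (1, 1, "foo"), (2, 1, "bar"),
    (3, 2, "foo"), (4, 2, "bar"),
    (5, 3, "foo"), (6, 3, "bar"),
    (7, 4, "foo"),
]


def values_by_parent(rows):
    """Collect, per parent id, the set of compared values of its referencing rows."""
    grouped = {}
    for _row_id, a_id, data in rows:
        grouped.setdefault(a_id, set()).add(data)
    return grouped


def matching_pairs(rows):
    """Return parent pairs (r, s), r < s, whose referencing rows carry equal value sets."""
    grouped = values_by_parent(rows)
    return [
        (r, s)
        for r, s in combinations(sorted(grouped), 2)
        if grouped[r] == grouped[s]
    ]


print(matching_pairs(B_ROWS))  # [(1, 2), (1, 3), (2, 3)]
```

Note that sets collapse duplicate values under one parent, whereas a count-based SQL check treats them as distinct rows, so the two readings can differ when a parent repeats a value.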
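The approach I came up with:
1. Find pairs (R, S) with matching columns.
2. Check whether the rows in B that reference R and S are equal.
3. For each row in B, find the matching rows, group them by the referenced rows in A, and count them. Check that the number of matching rows equals the number of referencing rows.
However, the query I wrote for steps 2 and 3 (below), which finds the matching rows in B, is quite complicated. Is there a better solution?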
-- Tables similar to those that I have.
CREATE TABLE a (
    id INTEGER PRIMARY KEY,
    data TEXT
);
CREATE TABLE b (
    id INTEGER PRIMARY KEY,
    a_id INTEGER REFERENCES a (id),
    data TEXT
);
SELECT DISTINCT dup.lhs_parent_id, dup.rhs_parent_id
FROM (
    SELECT DISTINCT
        MIN(lhs.a_id, rhs.a_id) AS lhs_parent_id, -- Normalize.
        MAX(lhs.a_id, rhs.a_id) AS rhs_parent_id,
        COUNT(*) AS count
    FROM b lhs
    INNER JOIN b rhs USING (data)
    WHERE NOT (lhs.id = rhs.id OR lhs.a_id = rhs.a_id) -- Remove self-matching rows and duplicate values with the same parent.
    GROUP BY lhs.a_id, rhs.a_id
) dup
INNER JOIN ( -- Check that lhs has the same number of rows.
    SELECT
        a_id AS parent_id,
        COUNT(*) AS count
    FROM b
    GROUP BY a_id
) lhs_ct ON (
    dup.lhs_parent_id = lhs_ct.parent_id AND
    dup.count = lhs_ct.count
)
INNER JOIN ( -- Check that rhs has the same number of rows.
    SELECT
        a_id AS parent_id,
        COUNT(*) AS count
    FROM b
    GROUP BY a_id
) rhs_ct ON (
    dup.rhs_parent_id = rhs_ct.parent_id AND
    dup.count = rhs_ct.count
);
-- Test data.
-- Expected query result is three rows with values (1, 2), (1, 3) and (2, 3) for a_id,
-- since the first three rows (with values 'row 1', 'row 2' and 'row 3')
-- have referencing rows, each of which has a matching pair. The fourth row
-- ('row 4') only has one referencing row with the value 'foo', so it doesn't have a
-- pair for the referenced rows with the value 'bar'.
INSERT INTO a (id, data) VALUES
    (1, 'row 1'),
    (2, 'row 2'),
    (3, 'row 3'),
    (4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'),
    (2, 1, 'bar'),
    (3, 2, 'foo'),
    (4, 2, 'bar'),
    (5, 3, 'foo'),
    (6, 3, 'bar'),
    (7, 4, 'foo');
I am using SQLite.
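Since the target is SQLite, the query and test data above can be checked end to end with Python's standard `sqlite3` module. This is only a small harness sketch: `SETUP`, `QUERY` and `run_question_query` are hypothetical names, and the SQL is copied from the question with whitespace condensed.

```python
import sqlite3

# Schema and test data from the question, condensed.
SETUP = """
CREATE TABLE a (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a (id), data TEXT);
INSERT INTO a (id, data) VALUES (1, 'row 1'), (2, 'row 2'), (3, 'row 3'), (4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'), (2, 1, 'bar'), (3, 2, 'foo'), (4, 2, 'bar'),
    (5, 3, 'foo'), (6, 3, 'bar'), (7, 4, 'foo');
"""

# The question's query, unchanged apart from layout.
QUERY = """
SELECT DISTINCT dup.lhs_parent_id, dup.rhs_parent_id
FROM (
    SELECT DISTINCT
        MIN(lhs.a_id, rhs.a_id) AS lhs_parent_id,
        MAX(lhs.a_id, rhs.a_id) AS rhs_parent_id,
        COUNT(*) AS count
    FROM b lhs
    INNER JOIN b rhs USING (data)
    WHERE NOT (lhs.id = rhs.id OR lhs.a_id = rhs.a_id)
    GROUP BY lhs.a_id, rhs.a_id
) dup
INNER JOIN (
    SELECT a_id AS parent_id, COUNT(*) AS count FROM b GROUP BY a_id
) lhs_ct ON (dup.lhs_parent_id = lhs_ct.parent_id AND dup.count = lhs_ct.count)
INNER JOIN (
    SELECT a_id AS parent_id, COUNT(*) AS count FROM b GROUP BY a_id
) rhs_ct ON (dup.rhs_parent_id = rhs_ct.parent_id AND dup.count = rhs_ct.count);
"""


def run_question_query():
    """Load the test data into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return sorted(conn.execute(QUERY).fetchall())
    finally:
        conn.close()


print(run_question_query())  # [(1, 2), (1, 3), (2, 3)]
```

On this data the result agrees with the three pairs described in the test-data comment.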
Best answer
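To find matching and differing rows, it is easier to use the INTERSECT and MINUS operations (SQLite spells MINUS as EXCEPT) and then join the results...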
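The answer does not spell out the set-operation variant, so the following is only one possible reading of it, not the answerer's code: two parents match when neither one has a value that the other lacks, which can be written as two `EXCEPT` subqueries that must both come back empty. The names `EXCEPT_QUERY`, `except_pairs` and the aliases `l`, `r`, `x`, `y` are hypothetical; the schema and data are the question's.

```python
import sqlite3

SETUP = """
CREATE TABLE a (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a (id), data TEXT);
INSERT INTO a (id, data) VALUES (1, 'row 1'), (2, 'row 2'), (3, 'row 3'), (4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'), (2, 1, 'bar'), (3, 2, 'foo'), (4, 2, 'bar'),
    (5, 3, 'foo'), (6, 3, 'bar'), (7, 4, 'foo');
"""

# Assumption: "matching" means equal sets of b.data values per parent.
# l.id < r.id keeps each pair once instead of in both directions.
EXCEPT_QUERY = """
SELECT l.id, r.id
FROM a AS l
JOIN a AS r ON l.id < r.id
WHERE NOT EXISTS (          -- no value under l that is missing under r
    SELECT x.data FROM b AS x WHERE x.a_id = l.id
    EXCEPT
    SELECT y.data FROM b AS y WHERE y.a_id = r.id
)
AND NOT EXISTS (            -- no value under r that is missing under l
    SELECT y.data FROM b AS y WHERE y.a_id = r.id
    EXCEPT
    SELECT x.data FROM b AS x WHERE x.a_id = l.id
)
ORDER BY l.id, r.id;
"""


def except_pairs():
    """Run the EXCEPT-based sketch against the question's test data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(EXCEPT_QUERY).fetchall()
    finally:
        conn.close()


print(except_pairs())  # [(1, 2), (1, 3), (2, 3)]
```

Two caveats with this reading: `EXCEPT` works on distinct values, so repeated values under one parent are not counted, and two parents with no referencing rows at all would match vacuously.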
However, when only one field is actually used for the comparison, a JOIN solution works better:
SELECT b1.a_id, b2.a_id
FROM (
    SELECT data, a_id, COUNT(id) AS a_count
    FROM b
    GROUP BY data, a_id
) b1
INNER JOIN (
    SELECT data, a_id, COUNT(id) AS a_count
    FROM b
    GROUP BY data, a_id
) b2 ON b1.data = b2.data AND b1.a_count = b2.a_count AND b1.a_id <> b2.a_id;
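As I understand it, you need to find the distinct pairs of a_id values that have the same data and the same count of that data.
My script returns each pairing in both directions, which leaves room for optimization using SQLite-specific syntax.
Example result: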
{1,2},{1,3},{2,1},{2,3},{3,2},{3,1}
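When tried against the question's test data, the join on its own compares parents one value at a time, so any two parents that share a single value are paired. The sketch below first runs the answer's join with `DISTINCT` and `b1.a_id < b2.a_id` added (to drop repeats and reversed pairs), then a tightened variant that additionally requires the number of shared value groups to equal each parent's total. The second query is an assumption about what the question wants, not part of the answer, and `grp`, `tot`, `t1`, `t2` and `run` are hypothetical names.

```python
import sqlite3

SETUP = """
CREATE TABLE a (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a (id), data TEXT);
INSERT INTO a (id, data) VALUES (1, 'row 1'), (2, 'row 2'), (3, 'row 3'), (4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
    (1, 1, 'foo'), (2, 1, 'bar'), (3, 2, 'foo'), (4, 2, 'bar'),
    (5, 3, 'foo'), (6, 3, 'bar'), (7, 4, 'foo');
"""

# The answer's join, with DISTINCT and "<" added to report each pair once.
ANSWER_JOIN = """
SELECT DISTINCT b1.a_id, b2.a_id
FROM (SELECT data, a_id, COUNT(id) AS a_count FROM b GROUP BY data, a_id) AS b1
INNER JOIN (SELECT data, a_id, COUNT(id) AS a_count FROM b GROUP BY data, a_id) AS b2
    ON b1.data = b2.data AND b1.a_count = b2.a_count AND b1.a_id < b2.a_id
ORDER BY 1, 2;
"""

# Tightened variant (an assumption): every value group of both parents
# must take part in the join, checked by comparing group counts.
WITH_TOTALS = """
WITH grp AS (
    SELECT data, a_id, COUNT(id) AS a_count FROM b GROUP BY data, a_id
),
tot AS (
    SELECT a_id, COUNT(*) AS n FROM grp GROUP BY a_id
)
SELECT b1.a_id, b2.a_id
FROM grp AS b1
JOIN grp AS b2
    ON b1.data = b2.data AND b1.a_count = b2.a_count AND b1.a_id < b2.a_id
JOIN tot AS t1 ON t1.a_id = b1.a_id
JOIN tot AS t2 ON t2.a_id = b2.a_id
GROUP BY b1.a_id, b2.a_id, t1.n, t2.n
HAVING COUNT(*) = t1.n AND COUNT(*) = t2.n
ORDER BY b1.a_id, b2.a_id;
"""


def run(sql):
    """Execute one query against a fresh in-memory copy of the test data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


print(run(ANSWER_JOIN))  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(run(WITH_TOTALS))  # [(1, 2), (1, 3), (2, 3)]
```

The first result includes pairs with parent 4 because it shares 'foo' with every other parent; the count comparison in the second query is what excludes them. Common table expressions need SQLite 3.8.3 or later.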
Regarding "sql - testing equality of referencing rows", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24864036/