Suppose I have records:
======= =========
Element id
======= =========
        "H"
        "O"
And another, similar one:
======== ==
Compound id
======== ==
         "Water"
With:
======== == =========== ========== ==========
Relation id compound_id element_id bond
======== == =========== ========== ==========
         1  "Water"     "H"        "Covalent"
         2  "Water"     "H"        "Covalent"
         3  "Water"     "O"        "Covalent"
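Now, most of my queries aren't for an exact match, but sometimes I want to find the compound with exactly the elements ["H", "H", "O"] (i.e. water), and not hydroxide (["H", "O"]) or peroxide (["H", "H", "O", "O"]). How should I go about this?

Consensus seems to have it that the best way to store arrays in SQL is through many-to-many intermediate tables. However, querying for an exact match without arrays looks slow and complicated, even with database-specific functions.

Best answer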
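It's best to keep the database normalized. In your particular case, I would store the count of each element per compound instead of creating a new row for every occurrence of an element.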
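As a rough sketch of that restructuring, the snippet below collapses the question's one-row-per-atom `Relation` data into one row per element with a count. It uses Python's built-in `sqlite3` purely so it can be run as-is; the table names `relation` and `elements`, the primary key, and the quoting of `"count"` are assumptions made for illustration, not something the question or answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout from the question: one row per atom (table name is assumed).
    create table relation (
        id          integer primary key,
        compound_id text not null,
        element_id  text not null,
        bond        text not null
    );
    insert into relation (id, compound_id, element_id, bond) values
        (1, 'Water', 'H', 'Covalent'),
        (2, 'Water', 'H', 'Covalent'),
        (3, 'Water', 'O', 'Covalent');

    -- Layout suggested in the answer: one row per element, with a count.
    -- "count" is quoted as a precaution because it is also a function name.
    create table elements (
        compound_id text    not null,
        element_id  text    not null,
        bond        text    not null,
        "count"     integer not null,
        primary key (compound_id, element_id, bond)
    );
    insert into elements (compound_id, element_id, bond, "count")
    select compound_id, element_id, bond, count(*)
    from relation
    group by compound_id, element_id, bond;
""")

rows = conn.execute(
    'select compound_id, element_id, bond, "count" '
    "from elements order by element_id"
).fetchall()
for row in rows:
    print(row)
```

The printed rows are the per-element counts for water: H with a count of 2 and O with a count of 1.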
compound_id   element_id   bond         count
-------------------------------------------------
"Water"       "H"          "Covalent"   2
"Water"       "O"          "Covalent"   1
The query for an exact match would be
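select compound_id
from elements
group by compound_id
having count(
         case when
           (element_id = 'H' and count = 2) or
           (element_id = 'O' and count = 1) then 1
         end
       ) = count(*)
   -- also require exactly two element rows; otherwise a compound made only of
   -- H (count 2) or only of O (count 1) would match as well
   and count(*) = 2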
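To make the behaviour of this kind of query easier to check, here is a minimal sketch using Python's built-in `sqlite3`. The helper `find_exact`, the sample compounds other than water, and the primary key are all made up for illustration. The generated query also requires the number of element rows to equal the number of requested elements, so that a compound containing only a subset of them (such as the hypothetical "Hydrogen" row) is not returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "count" is quoted as a precaution because it is also a function name.
    create table elements (
        compound_id text    not null,
        element_id  text    not null,
        bond        text    not null,
        "count"     integer not null,
        primary key (compound_id, element_id)
    );
    -- Illustrative sample data; only the Water rows come from the answer.
    insert into elements (compound_id, element_id, bond, "count") values
        ('Water',     'H', 'Covalent', 2),
        ('Water',     'O', 'Covalent', 1),
        ('Hydroxide', 'H', 'Covalent', 1),
        ('Hydroxide', 'O', 'Covalent', 1),
        ('Peroxide',  'H', 'Covalent', 2),
        ('Peroxide',  'O', 'Covalent', 2),
        ('Hydrogen',  'H', 'Covalent', 2);
""")


def find_exact(conn, required):
    """Return ids of compounds whose element counts equal `required` exactly.

    `required` maps element_id to count, e.g. {"H": 2, "O": 1}.
    """
    if not required:
        raise ValueError("required must name at least one element")
    # One "(element_id = ? and count = ?)" test per requested element.
    pair_clause = " or ".join(
        ['(element_id = ? and "count" = ?)'] * len(required)
    )
    params = [value for pair in required.items() for value in pair]
    sql = f"""
        select compound_id
        from elements
        group by compound_id
        having count(case when {pair_clause} then 1 end) = count(*)
           and count(*) = ?
        order by compound_id
    """
    return [row[0] for row in conn.execute(sql, params + [len(required)])]


water = find_exact(conn, {"H": 2, "O": 1})
peroxide = find_exact(conn, {"H": 2, "O": 2})
print(water)     # prints ['Water']
print(peroxide)  # prints ['Peroxide']
```

Only placeholders are interpolated into the SQL string; the element names and counts themselves are passed as bound parameters.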
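However, this approach is suboptimal, because it will use a sequential scan. If denormalization is not a problem, storing the number of distinct elements for each compound will help.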
compound_id   element_count
------------------------------
"Water"       2
Then the query could be
select e.compound_id
from elements e
join compounds c on e.compound_id = c.compound_id
where c.element_count = 2 and
((e.element_id = 'H' and e.count = 2) or
(e.element_id = 'O' and e.count = 1))
group by e.compound_id
having count(*) = 2
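The denormalized variant can be tried out the same way. The sketch below again uses Python's `sqlite3`; the sample compounds other than water and the index names are hypothetical, and `element_count` is derived from the `elements` rows here only to keep the sample data consistent. Whether a given engine actually picks up the two indexes depends on its planner, so treat them as one plausible setup rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table compounds (
        compound_id   text primary key,
        element_count integer not null  -- number of distinct elements
    );
    -- "count" is quoted as a precaution because it is also a function name.
    create table elements (
        compound_id text    not null,
        element_id  text    not null,
        "count"     integer not null,
        primary key (compound_id, element_id)
    );
    -- Hypothetical index names.
    create index idx_compounds_element_count on compounds (element_count);
    create index idx_elements_element_count on elements (element_id, "count");

    -- Illustrative sample data; only Water comes from the answer.
    insert into elements (compound_id, element_id, "count") values
        ('Water',     'H', 2),
        ('Water',     'O', 1),
        ('Hydroxide', 'H', 1),
        ('Hydroxide', 'O', 1),
        ('Peroxide',  'H', 2),
        ('Peroxide',  'O', 2),
        ('Hydrogen',  'H', 2);

    -- Fill the denormalized column from the element rows.
    insert into compounds (compound_id, element_count)
    select compound_id, count(*) from elements group by compound_id;
""")

sql = """
    select e.compound_id
    from elements e
    join compounds c on e.compound_id = c.compound_id
    where c.element_count = 2
      and ((e.element_id = 'H' and e."count" = 2) or
           (e.element_id = 'O' and e."count" = 1))
    group by e.compound_id
    having count(*) = 2
"""
matches = [row[0] for row in conn.execute(sql)]
print(matches)  # prints ['Water']
```

The cost of this design is that `element_count` has to be kept in sync whenever a compound's element rows change.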
If there are indexes on compounds(element_count) and elements(element_id, count), the query can use them to retrieve results quickly even when the database is large.

Source: "mysql - SQL - many-to-many alternative?" on Stack Overflow: https://stackoverflow.com/questions/54307056/