Problem description
I'm pondering the design of an application where the main feature revolves around the ability to find the set of all sets which are subsets of a given set.
For example, given the input set A={1,2,3...50} and the set of sets B={ B1={3,5,9,12}, B2={1,6,100,123,45} ... B500={8,67,450} }, return all Bs which are a subset of A.
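To make the expected result concrete, here is a brute-force baseline in plain Python using only the three Bs spelled out above (the elided B3 to B499 are left out). It is a sketch of what is being asked for, not a proposed design; the variable names are just for illustration.

```python
# Input set A = {1, 2, ..., 50}
A = set(range(1, 51))

# Only the Bs written out in the example; the rest are elided in the question.
B = {
    "B1": {3, 5, 9, 12},
    "B2": {1, 6, 100, 123, 45},
    "B500": {8, 67, 450},
}

# `members <= A` is Python's subset test (same as members.issubset(A)).
subsets_of_A = [name for name, members in B.items() if members <= A]
print(subsets_of_A)  # ['B1']
```

This checks every B against A one by one, which is the cost a more efficient or standard approach would need to beat.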
I guess it's similar to a search engine, except that I don't really have the luxury of set A being small and the Bs being large; in my case Bs are usually smaller than A.
I found a similar question here, but was wondering if there was anything more efficient / standard.
Recommended answer
Harper's answer is correct and elegant, and is certainly the "standard" among experienced SQL coders. The requirement, of course, is that the database must be normalised: Parent is not duplicated; Parent::Child has two relations; and the Child table has two unique indices, (ParentKey, ChildKey) and (ChildKey, ParentKey). Otherwise "all bets are off". It is not possible to get better performance than that (assuming the server is configured properly for the hardware, etc.). The next step is 6NF, which does provide a significant increase in performance, but you do not need to go there unless you have to. If your Bs are smaller than your As, it will be very fast.
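Harper's answer is not reproduced here, so the following is only a sketch of one common way to write the join form of this query (often called relational division), not necessarily the query from that answer. The Parent and Child tables and the ParentKey and ChildKey columns come from the description above; the InputSet table holding A, the index name, and the use of an in-memory SQLite database through Python's `sqlite3` module are assumptions made so the example can run.

```python
import sqlite3

# In-memory SQLite database, used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (ParentKey INTEGER PRIMARY KEY);
    CREATE TABLE Child (
        ParentKey INTEGER NOT NULL REFERENCES Parent (ParentKey),
        ChildKey  INTEGER NOT NULL,
        PRIMARY KEY (ParentKey, ChildKey)             -- first unique index
    );
    CREATE UNIQUE INDEX Child_ChildKey_ParentKey      -- second unique index
        ON Child (ChildKey, ParentKey);
    -- Hypothetical table holding the members of the input set A.
    CREATE TABLE InputSet (ChildKey INTEGER PRIMARY KEY);
""")

# The three Bs written out in the question, keyed 1, 2 and 500.
sets_b = {1: [3, 5, 9, 12], 2: [1, 6, 100, 123, 45], 500: [8, 67, 450]}
conn.executemany("INSERT INTO Parent (ParentKey) VALUES (?)",
                 [(key,) for key in sets_b])
conn.executemany(
    "INSERT INTO Child (ParentKey, ChildKey) VALUES (?, ?)",
    [(key, member) for key, members in sets_b.items() for member in members],
)
conn.executemany("INSERT INTO InputSet (ChildKey) VALUES (?)",
                 [(n,) for n in range(1, 51)])

# A parent is a subset of A when every one of its children found a match:
# COUNT(*) counts all children, COUNT(i.ChildKey) only the matched ones.
join_query = """
    SELECT c.ParentKey
    FROM Child AS c
    LEFT JOIN InputSet AS i ON i.ChildKey = c.ChildKey
    GROUP BY c.ParentKey
    HAVING COUNT(*) = COUNT(i.ChildKey)
    ORDER BY c.ParentKey
"""
subset_keys = [row[0] for row in conn.execute(join_query)]
print(subset_keys)  # [1]
```

Note that this form only returns parents that have at least one row in Child, so an empty B would not be reported even though the empty set is a subset of every A.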
The alternative is to use subqueries. Depending on your database vendor, subqueries can be faster, particularly if your Bs are smaller than your As. For example, Sybase handles subqueries far better than MS.
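As a rough illustration of the subquery form, the sketch below uses a double NOT EXISTS: keep each parent for which no child exists that is missing from A. As before, the InputSet table and the SQLite setup are assumptions for the sake of a runnable example, and whether this beats the join form depends on the vendor's optimiser, so it is worth measuring on your own server.

```python
import sqlite3

# Same assumed schema as the description: Parent, Child with two unique
# indices, plus a hypothetical InputSet table holding the members of A.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (ParentKey INTEGER PRIMARY KEY);
    CREATE TABLE Child (
        ParentKey INTEGER NOT NULL REFERENCES Parent (ParentKey),
        ChildKey  INTEGER NOT NULL,
        PRIMARY KEY (ParentKey, ChildKey)
    );
    CREATE UNIQUE INDEX Child_ChildKey_ParentKey
        ON Child (ChildKey, ParentKey);
    CREATE TABLE InputSet (ChildKey INTEGER PRIMARY KEY);
""")

sets_b = {1: [3, 5, 9, 12], 2: [1, 6, 100, 123, 45], 500: [8, 67, 450]}
conn.executemany("INSERT INTO Parent (ParentKey) VALUES (?)",
                 [(key,) for key in sets_b])
conn.executemany(
    "INSERT INTO Child (ParentKey, ChildKey) VALUES (?, ?)",
    [(key, member) for key, members in sets_b.items() for member in members],
)
conn.executemany("INSERT INTO InputSet (ChildKey) VALUES (?)",
                 [(n,) for n in range(1, 51)])

# Keep a parent only if it has no child that is absent from InputSet.
subquery_form = """
    SELECT p.ParentKey
    FROM Parent AS p
    WHERE NOT EXISTS (
        SELECT 1
        FROM Child AS c
        WHERE c.ParentKey = p.ParentKey
          AND NOT EXISTS (
              SELECT 1 FROM InputSet AS i WHERE i.ChildKey = c.ChildKey
          )
    )
    ORDER BY p.ParentKey
"""
subset_keys_subquery = [row[0] for row in conn.execute(subquery_form)]
print(subset_keys_subquery)  # [1]
```

Unlike a join with GROUP BY, this form starts from Parent, so a parent with no children (an empty B) would also be returned.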