问题描述
每个产品"最多可以有 10000 个细分"行.这些段有一个从 1 开始的排序列(1, 2, 3, 4, 5, ...)和一个值列,可以包含任何值,例如 (323.113, 5423.231, 873.42, 422.64, 763.1,...).
Each 'Product' can have as many as 10000 'Segment' rows. The segments have a sort column that starts at 1 for each product (1, 2, 3, 4, 5, ...) and a value column that can contain any values such as (323.113, 5423.231, 873.42, 422.64, 763.1, ...).
我想确定给定细分市场子集的产品的潜在匹配项.例如,如果我有 5 个按正确顺序排列的段值,我如何才能有效地找到在 Segment 表中某处的所有 5 个段都按相同顺序排列的所有产品?
I would like to identify potential matches for products given a subset of segments. For example, if I have 5 segment values in the correct order, how can I efficiently find all the products which have all 5 segments in the same order somewhere in the Segment table?
更新
我在这里发布了一个后续问题:使用非精确测量(模糊逻辑)查找一系列数据
I posted a follow-up question to this here: Find a series of data using non-exact measurements (fuzzy logic)
推荐答案
假设表格如下:
CREATE TABLE Products
(
ProductId int not null
constraint PK_Products
primary key
,Name varchar(100) not null
)
CREATE TABLE Segments
(
ProductId int not null
constraint FK_Segments__Products
foreign key references Products (ProductId)
,OrderBy int not null
,Value float not null
,constraint PK_Segments
primary key (ProductId, OrderBy)
)
接下来,在临时表中设置您的搜索数据:
Next, set up your search data in a temp table:
CREATE TABLE #MatchThis
(
Position int not null
,Value float not null
)
对于 N 个搜索对象,这必须像这样填充
For N search objects, this has to be populated like so
First item 0 <value 1>
Second item 1 <value 2>
Third item 2 <value 3>
...
Nth item N-1 <value N>
现在设置一些重要的值.(这可能会被塞进最终查询中,但这种方式更易于阅读,并且可能会略微提高性能.)
Now set up some important values. (This could be crammed into the final query, but this way makes it easier to read, and may improve performance slightly.)
DECLARE
@ItemCount int
,@FirstValue float
-- How many items to be matched ("N", above)
SELECT @ItemCount = count(*)
from #MatchThis
-- The value of the first item in the search set
SELECT @FirstValue = Value
from #MatchThis
where Position = 0
然后它只是一个查询:
SELECT
pr.Name
,fv.OrderBy -- Required by the Group By, but otherwise can be ignored
from #MatchThis mt
cross join (-- All Segments that match the first value in the set
select ProductId, OrderBy
from Segment
where Value = @FirstValue) fv
inner join Product pr -- Just to get the Product name
on pr.ProductId = fv.ProductId
inner join Segment se
on se.ProductId = fv.ProductId
and se.OrderBy = fv.OrderBy + mt.Position -- Lines them up based on the first value
and se.Value = mt.Value -- No join if the values don't match
group by
pr.Name
,fv.OrderBy
having count(*) = @ItemCount -- Only include if as many segments pulled for this product/segment.OrderBy as are required
我确信这会奏效,但我现在没有时间对其进行详细测试.为了优化性能,除了指示的主键之外,您还可以在 Segment.Value 上添加常规索引
I am convinced this will work, but I don't have time to test it out in detail just now. To optimize for performance, besides the primary keys indicated you could add a regular index on Segment.Value
这篇关于查找顺序值的有效方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!