查找顺序值的有效方法

本文介绍了查找顺序值的有效方法的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

每个产品"最多可以有 10000 个细分"行.这些段有一个从 1 开始的排序列(1, 2, 3, 4, 5, ...)和一个值列，可以包含任何值，例如 (323.113, 5423.231, 873.42, 422.64, 763.1,...).

Each 'Product' can have as many as 10000 'Segment' rows. The segments have a sort column that starts at 1 for each product (1, 2, 3, 4, 5, ...) and a value column that can contain any values such as (323.113, 5423.231, 873.42, 422.64, 763.1, ...).

我想确定给定细分市场子集的产品的潜在匹配项.例如，如果我有 5 个按正确顺序排列的段值，我如何才能有效地找到在 Segment 表中某处的所有 5 个段都按相同顺序排列的所有产品?

I would like to identify potential matches for products given a subset of segments. For example, if I have 5 segment values in the correct order, how can I efficiently find all the products which have all 5 segments in the same order somewhere in the Segment table?

更新

我在这里发布了一个后续问题:使用非精确测量(模糊逻辑)查找一系列数据

I posted a follow-up question to this here: Find a series of data using non-exact measurements (fuzzy logic)

推荐答案

假设表格如下:

CREATE TABLE Products
 (
   ProductId  int           not null
    constraint PK_Products
     primary key
  ,Name       varchar(100)  not null
 )

CREATE TABLE Segments
 (
   ProductId   int    not null
    constraint FK_Segments__Products
     foreign key references Products (ProductId)
  ,OrderBy     int    not null
  ,Value       float  not null
  ,constraint PK_Segments
    primary key (ProductId, OrderBy)
 )

接下来，在临时表中设置您的搜索数据:

Next, set up your search data in a temp table:

CREATE TABLE #MatchThis
 (
   Position  int    not null
  ,Value     float  not null
 )

对于 N 个搜索对象，这必须像这样填充

For N search objects, this has to be populated like so

First item   0    <value 1>
Second item  1    <value 2>
Third item   2    <value 3>
...
Nth item     N-1  <value N>

现在设置一些重要的值.(这可能会被塞进最终查询中，但这种方式更易于阅读，并且可能会略微提高性能.)

Now set up some important values. (This could be crammed into the final query, but this way makes it easier to read, and may improve performance slightly.)

DECLARE
  @ItemCount   int
 ,@FirstValue  float

--  How many items to be matched ("N", above)
SELECT @ItemCount = count(*)
 from #MatchThis

--  The value of the first item in the search set
SELECT @FirstValue = Value
 from #MatchThis
 where Position = 0

然后它只是一个查询:

SELECT
   pr.Name
  ,fv.OrderBy  --  Required by the Group By, but otherwise can be ignored
 from #MatchThis mt
  cross join (--  All Segments that match the first value in the set
              select ProductId, OrderBy
               from Segment
               where Value = @FirstValue) fv
  inner join Product pr  --  Just to get the Product name
   on pr.ProductId = fv.ProductId
  inner join Segment se
   on se.ProductId = fv.ProductId
    and se.OrderBy = fv.OrderBy + mt.Position  --  Lines them up based on the first value
    and se.Value = mt.Value                    --  No join if the values don't match
 group by
   pr.Name
  ,fv.OrderBy
 having count(*) = @ItemCount  --  Only include if as many segments pulled for this product/segment.OrderBy as are required

我确信这会奏效，但我现在没有时间对其进行详细测试.为了优化性能，除了指示的主键之外，您还可以在 Segment.Value 上添加常规索引

I am convinced this will work, but I don't have time to test it out in detail just now. To optimize for performance, besides the primary keys indicated you could add a regular index on Segment.Value

这篇关于查找顺序值的有效方法的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！