本文介绍了一个表中的哪一行(按某列顺序)对应于另一个表中的一行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个表,如下所示(从实际简化):

I have two tables, as follows (simplified from actual):


mysql> desc small_table;
+-----------------+---------------+------+-----+---------+-------+
| Field           | Type          | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+-------+
| event_time      | datetime      | NO   |     | NULL    |       |
| user_id         | char(15)      | NO   |     | NULL    |       |
| other_data      | int(11)       | NO   | MUL | NULL    |       |
+-----------------+---------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql> desc large_table;
+-----------------+---------------+------+-----+---------+-------+
| Field           | Type          | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+-------+
| event_time      | datetime      | NO   |     | NULL    |       |
| user_id         | char(15)      | NO   |     | NULL    |       |
| other_data      | int(11)       | NO   |     | NULL    |       |
+-----------------+---------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

现在,small_table非常小:对于每个user_id,通常只有一行(尽管有时会有更多行).另一方面,在large_table中,每个user_id出现多次.

Now, small_table is, well, small: for each user_id there is usually only one row (though there are sometimes more). In large_table, on the other hand, each user_id appears numerous times.


mysql> select count(1) from small_table\G
*************************** 1. row ***************************
count(1): 20182
1 row in set (0.00 sec)


mysql> select count(1) from large_table\G
*************************** 1. row ***************************
count(1): 2870522
1 row in set (0.00 sec)

但是,这很重要,对于small_table中的每一行,large_table中至少有一行具有相同的user_id,相同的other_data和相似的event_time(在内部相同)几分钟).

However, and this is important, for each row in small_table, there is at least one row in large_table with the same user_id, the same other_data, and similar event_time (the same within a few minutes, say).

我想知道对于相同的user_id和相似的event_timesmall_table是否具有与第一或第二个或large_table中任何不同的不同的行相对应的行>.也就是说,我想要:

I want to know whether small_table has a row corresponding to the first, or the second, or the whatever distinct row in large_table for the same user_id and similar event_time. That is, I'd like:

    对于每个user_id
  1. 按照event_time的顺序对large_table的不同行进行计数,但仅对于event_time,例如三个小时之内;就是说,我只求例如三个小时之内具有event_time的行的计数;和
  2. 对于每个这样的不同行集合,标识该列表中的哪一行(按event_time顺序)在small_table中具有对应的行.
  1. for each user_id, a count of distinct rows of large_table in order by event_time, but only for event_time within, say, three hours; that is, I seek only the count of such rows as have event_time within, say, three hours of one another; and
  2. for each such collection of distinct rows, an identification of which row in that list (in order by event_time) has a corresponding row in small_table.

即使执行第一个步骤的查询,我似乎也无法编写,更不用说执行第二个查询的查询了.

I don't seem able to write even a query that will do the first step, let alone a query that will do the second, and would appreciate any direction.

推荐答案

此必要的SQL是残酷的;它将为您的优化程序提供非常认真的锻炼.

The necessary SQL for this is brutal; it will give your optimizer a really rather serious workout.

从问题和问题后面的评论来看,如果给定用户ID的事件序列都落在相邻事件之间的某个固定间隔内,则希望将大表中给定用户ID的事件序列视为连续".例如,固定间隔为3小时.我正在为IBM Informix Dynamic Server编码(出于争论,版本为11.70,但11.50也可以正常工作).这意味着我需要解释一个特殊的符号.具体而言,符号3 UNITS HOUR表示3小时的间隔.也可以在SQL的Informix方言中将其写为INTERVAL(3) HOUR TO HOUR,或者在标准SQL中将其写为INTERVAL '3' HOUR.

Judging from the comments after the question as well as the question, the desire is to treat sequences of events for a given user ID in the large table as 'contiguous' if they all fall within some fixed interval between adjacent events. For the sake of example, the fixed interval will be 3 hours. I'm coding for IBM Informix Dynamic Server (for sake of argument, version 11.70, but 11.50 would also work fine). This means that there is an idiosyncratic notation I need to explain. Specifically, the notation 3 UNITS HOUR denotes an interval of 3 hours; it could also be written INTERVAL(3) HOUR TO HOUR in the Informix dialect of SQL, or as INTERVAL '3' HOUR in standard SQL.

在生成SQL(特别是复杂的SQL)时,有几种关键技术.一种是逐步构建SQL,逐步组装最终结果.另一个是要确保对所要处理的东西有一个清晰的说明.

There are a couple of crucial techniques in generating SQL, especially complex SQL. One is to build the SQL up in steps, piecemeal, assembling the final result. The other is to ensure that you have a clear specification of what it is you are after.

在下面的表示法中,应将同一User_ID"的限定条件始终视为表达式的一部分.

In the notation that follows, the qualification 'for the same User_ID' should be taken as always being part of the expression.

在大表中,我们希望在与小表连接之前要考虑三类范围.

In the large table, there are three categories of range that we want to consider before joining with the small table.

  1. 条目中既没有事件发生前的事件时间已经足够近,也没有事件发生后的事件时间已经足够近了.这是一个具有相同开始时间和结束时间的时间范围.
  2. 表中的一对条目,它们本身足够接近,但是对于它们而言,既没有比该事件早于足够接近的事件发生,也没有比该事件的晚得足够晚的事件发生.两人之间也没有事件.
  3. 表中三个或三个以上条目的序列,其中包含:
    • 没有事件E1早于足够接近的最早事件
    • 事件E2的发生时间不得晚于最近发生的事件
    • 事件E3比最早的事件晚,距最早的事件很近
    • 早于最新事件的事件E4与最新事件足够接近(E4可能与E3是同一事件)
    • 最早和最晚之间没有事件E5,E6,在E5和E6之间没有事件,但是E5和E6之间的差距太大,无法计数.
  1. Entries where there is neither a row with an event time before the event that is close enough nor a row with an event time after the event that is close enough. This is a time range with the same start and end time.
  2. A pair of entries in the table that are themselves close enough but for which there is neither an event earlier than the early event of the pair that is close enough nor an event later than the late event of the pair that is close enough nor an event in between the pair.
  3. A sequence of three or more entries in the table for which there is:
    • No event E1 earlier than the earliest that is close enough
    • No event E2 later than the latest that is close enough
    • An event E3 later than the earliest that is close enough to the earliest
    • An event E4 earlier than the latest that is close enough to the latest (E4 might possibly be the same event as E3)
    • No pair of events E5, E6 between the earliest and the latest where there is no event in between E5 and E6 but the gap between E5 and E6 is too large to count.

从描述中可以看到,这将是一些可怕的SQL!

As you can see from the description, this is going to be some scary SQL!

NB:该代码现已通过测试;一些(主要是小的)更改是必要的.一个小的不必要更改是在中间查询上添加了ORDER BY子句.另一个不必要的更改是从小表中选择其他数据以进行验证.进行此修订时并未研究 msh210 .

还要注意,我不确定这是否是最低限度的表述;用一个SELECT语句而不是三个SELECT语句的UNION对所有范围进行分类可能是可行的(如果那样的话会很好).

Also note that I'm far from certain this is a minimal formulation; it may be feasible to classify all the ranges with a single SELECT statement instead of a UNION of three SELECT statements (and it would be good if that is the case).

-- Ranges of exactly 1 event
SELECT lt1.user_id, lt1.event_time AS min_time, lt1.event_time AS max_time
  FROM Large_Table AS lt1
 WHERE NOT EXISTS -- an earlier event that is close enough
       (SELECT *
          FROM Large_Table AS lt3
         WHERE lt1.user_id = lt3.user_id
           AND lt3.event_time > lt1.event_time - 3 UNITS HOUR
           AND lt3.event_time < lt1.event_time
       )
   AND NOT EXISTS -- a later event that is close enough
       (SELECT *
          FROM Large_Table AS lt4
         WHERE lt1.user_id = lt4.user_id
           AND lt4.event_time > lt1.event_time
           AND lt4.event_time < lt1.event_time + 3 UNITS HOUR
       )
ORDER BY User_ID, Min_Time;

Doubleton Ranges

-- Ranges of exactly 2 events
SELECT lt1.user_id, lt1.event_time AS min_time, lt2.event_time AS max_time
  FROM Large_Table AS lt1
  JOIN Large_Table AS lt2
    ON lt1.user_id = lt2.user_id
   AND lt1.event_time < lt2.event_time
   AND lt2.event_time < lt1.event_time + 3 UNITS HOUR
 WHERE NOT EXISTS -- an earlier event that is close enough
       (SELECT *
          FROM Large_Table AS lt3
         WHERE lt1.user_id = lt3.user_id
           AND lt3.event_time > lt1.event_time - 3 UNITS HOUR
           AND lt3.event_time < lt1.event_time
       )
   AND NOT EXISTS -- a later event that is close enough
       (SELECT *
          FROM Large_Table AS lt4
         WHERE lt1.user_id = lt4.user_id
           AND lt4.event_time > lt2.event_time
           AND lt4.event_time < lt2.event_time + 3 UNITS HOUR
       )
   AND NOT EXISTS -- any event in between
       (SELECT *
          FROM Large_Table AS lt5
         WHERE lt1.user_id = lt5.user_id
           AND lt5.event_time > lt1.event_time
           AND lt5.event_time < lt2.event_time
       )
ORDER BY User_ID, Min_Time;

在外部WHERE子句中添加了3小时标准.

-- Ranges of 3 or more events
SELECT lt1.user_id, lt1.event_time AS min_time, lt2.event_time AS max_time
  FROM Large_Table AS lt1
  JOIN Large_Table AS lt2
    ON lt1.user_id = lt2.user_id
   AND lt1.event_time < lt2.event_time
 WHERE NOT EXISTS -- an earlier event that is close enough
       (SELECT *
          FROM Large_Table AS lt3
         WHERE lt1.user_id = lt3.user_id
           AND lt3.event_time > lt1.event_time - 3 UNITS HOUR
           AND lt3.event_time < lt1.event_time
       )
   AND NOT EXISTS -- a later event that is close enough
       (SELECT *
          FROM Large_Table AS lt4
         WHERE lt1.user_id = lt4.user_id
           AND lt4.event_time > lt2.event_time
           AND lt4.event_time < lt2.event_time + 3 UNITS HOUR
       )
   AND NOT EXISTS -- a gap that's too big in the events between first and last
       (SELECT *
          FROM Large_Table AS lt5 -- E5 before E6
          JOIN Large_Table AS lt6
            ON lt5.user_id = lt6.user_id
           AND lt5.event_time < lt6.event_time
         WHERE lt1.user_id = lt5.user_id
           AND lt6.event_time < lt2.event_time
           AND lt5.event_time > lt1.event_time
           AND (lt6.event_time - lt5.event_time) > 3 UNITS HOUR
           AND NOT EXISTS -- an event in between these two
               (SELECT *
                  FROM Large_Table AS lt9
                 WHERE lt5.user_id = lt9.user_id
                   AND lt9.event_time > lt5.event_time
                   AND lt9.event_time < lt6.event_time
               )
       )
   AND EXISTS -- an event close enough after the start
       (SELECT *
          FROM Large_Table AS lt7
         WHERE lt1.user_id = lt7.user_id
           AND lt1.event_time < lt7.event_time
           AND lt7.event_time < lt1.event_time + 3 UNITS HOUR
           AND lt7.event_time < lt2.event_time
       )
   AND EXISTS -- an event close enough before the end
       (SELECT *
          FROM Large_Table AS lt8
         WHERE lt2.user_id = lt8.user_id
           AND lt2.event_time > lt8.event_time
           AND lt8.event_time > lt2.event_time - 3 UNITS HOUR
           AND lt8.event_time > lt1.event_time
       )
ORDER BY User_ID, Min_Time;

在大空白"子查询中添加了省略的嵌套NOT NOTISTS子句.

很显然,最后一个表中的范围的完整列表是上面三个查询的并集.

Clearly, the complete list of ranges in the last table are the union of the three queries above.

查询已删除,因为不够有趣.这只是上面单独查询的三向UNION.

最终查询将查找可怕的3部分UNION结果的范围(如果有的话),该结果与小表中的条目足够接近.小表中的一个条目可能位于例如13:00,大表中可能有一个范围从11:00结束,另一个范围从15:00开始.大表中的两个范围是分开的(它们之间的间隔是4个小时),但是小表中的条目足够接近以进行计数. [测试涵盖了这种情况.]

The final query finds the ranges, if any, in the result of the gruesome 3-part UNION which is close enough to the entry in the small table. A single entry in the small table might fall at, say, 13:00, and there might be a range in the large table that ends at 11:00 and another that starts at 15:00. The two ranges from the large table are separate (the gap between them is 4 hours), but the entry in the small table is close enough to both to count. [The tests cover this case.]

SELECT S.User_id, S.Event_Time, L.Min_Time, L.Max_Time, S.Other_Data
  FROM Small_Table AS S
  JOIN (
        -- Ranges of exactly 1 event
        SELECT lt1.user_id, lt1.event_time AS min_time, lt1.event_time AS max_time
          FROM Large_Table AS lt1
         WHERE NOT EXISTS -- an earlier event that is close enough
               (SELECT *
                  FROM Large_Table AS lt3
                 WHERE lt1.user_id = lt3.user_id
                   AND lt3.event_time > lt1.event_time - 3 UNITS HOUR
                   AND lt3.event_time < lt1.event_time
               )
           AND NOT EXISTS -- a later event that is close enough
               (SELECT *
                  FROM Large_Table AS lt4
                 WHERE lt1.user_id = lt4.user_id
                   AND lt4.event_time > lt1.event_time
                   AND lt4.event_time < lt1.event_time + 3 UNITS HOUR
               )
        UNION
        -- Ranges of exactly 2 events
        SELECT lt1.user_id, lt1.event_time AS min_time, lt2.event_time AS max_time
          FROM Large_Table AS lt1
          JOIN Large_Table AS lt2
            ON lt1.user_id = lt2.user_id
           AND lt1.event_time < lt2.event_time
           AND lt2.event_time < lt1.event_time + 3 UNITS HOUR
         WHERE NOT EXISTS -- an earlier event that is close enough
               (SELECT *
                  FROM Large_Table AS lt3
                 WHERE lt1.user_id = lt3.user_id
                   AND lt3.event_time > lt1.event_time - 3 UNITS HOUR
                   AND lt3.event_time < lt1.event_time
               )
           AND NOT EXISTS -- a later event that is close enough
               (SELECT *
                  FROM Large_Table AS lt4
                 WHERE lt1.user_id = lt4.user_id
                   AND lt4.event_time > lt2.event_time
                   AND lt4.event_time < lt2.event_time + 3 UNITS HOUR
               )
           AND NOT EXISTS -- any event in between
               (SELECT *
                  FROM Large_Table AS lt5
                 WHERE lt1.user_id = lt5.user_id
                   AND lt5.event_time > lt1.event_time
                   AND lt5.event_time < lt2.event_time
               )
        UNION
        -- Ranges of 3 or more events
        SELECT lt1.user_id, lt1.event_time AS min_time, lt2.event_time AS max_time
          FROM Large_Table AS lt1
          JOIN Large_Table AS lt2
            ON lt1.user_id = lt2.user_id
           AND lt1.event_time < lt2.event_time
         WHERE NOT EXISTS -- an earlier event that is close enough
               (SELECT *
                  FROM Large_Table AS lt3
                 WHERE lt1.user_id = lt3.user_id
                   AND lt3.event_time > lt1.event_time - 3 UNITS HOUR
                   AND lt3.event_time < lt1.event_time
               )
           AND NOT EXISTS -- a later event that is close enough
               (SELECT *
                  FROM Large_Table AS lt4
                 WHERE lt1.user_id = lt4.user_id
                   AND lt4.event_time > lt2.event_time
                   AND lt4.event_time < lt2.event_time + 3 UNITS HOUR
               )
           AND NOT EXISTS -- a gap that's too big in the events between first and last
               (SELECT *
                  FROM Large_Table AS lt5 -- E5 before E6
                  JOIN Large_Table AS lt6
                    ON lt5.user_id = lt6.user_id
                   AND lt5.event_time < lt6.event_time
                 WHERE lt1.user_id = lt5.user_id
                   AND lt6.event_time < lt2.event_time
                   AND lt5.event_time > lt1.event_time
                   AND (lt6.event_time - lt5.event_time) > 3 UNITS HOUR
                   AND NOT EXISTS -- an event in between these two
                       (SELECT *
                          FROM Large_Table AS lt9
                         WHERE lt5.user_id = lt9.user_id
                           AND lt9.event_time > lt5.event_time
                           AND lt9.event_time < lt6.event_time
                       )
               )
           AND EXISTS -- an event close enough after the start
               (SELECT *
                  FROM Large_Table AS lt7
                 WHERE lt1.user_id = lt7.user_id
                   AND lt1.event_time < lt7.event_time
                   AND lt7.event_time < lt1.event_time + 3 UNITS HOUR
                   AND lt7.event_time < lt2.event_time
               )
           AND EXISTS -- an event close enough before the end
               (SELECT *
                  FROM Large_Table AS lt8
                 WHERE lt2.user_id = lt8.user_id
                   AND lt2.event_time > lt8.event_time
                   AND lt8.event_time > lt2.event_time - 3 UNITS HOUR
                   AND lt8.event_time > lt1.event_time
               )
       ) AS L
    ON S.User_ID = L.User_ID
 WHERE S.Event_Time > L.Min_Time - 3 UNITS HOUR
   AND S.Event_Time < L.Max_Time + 3 UNITS HOUR
ORDER BY User_ID, Event_Time, Min_Time;


该代码现已通过测试.无限的机会实际上是零;有一个语法错误和一些或多或少的小问题需要解决.

设计测试数据后,我分阶段进行了实验.在验证和修复查询时,我使用了"Alpha"数据(请参见下文),并添加了Beta数据只是为了确保不同的User_ID值之间没有串扰.

I experimented in stages after devising the test data. I used the 'Alpha' data (see below) while validating and fixing the queries, and added the Beta data only to ensure that there was no cross-talk between different User_ID values.

我使用显式的<>操作而不是BETWEEN ... AND来排除端点;如果您希望事件间隔恰好3小时算作足够接近",那么您需要查看每个不等式,可能将其更改为BETWEEN ... AND,或者适当地使用>=<=.

I used explicit < and > operations rather than BETWEEN ... AND to exclude the end points; if you want events exactly 3 hours apart to count as 'close enough', then you need to review each inequality, possibly changing them to BETWEEN ... AND or possibly just using >= or <= as appropriate.

有一个 answer (a)我写过,(b)对上面的复杂处理提出了一些有用的想法(特别是,没有较早的事件但足够近的时间"和没有较晚的事件但足够近的时间"标准) 足够接近"的标准无疑使这个问题复杂化.

There's an answer to a loosely similar but rather simpler question that (a) I wrote and (b) provided some helpful thoughts on the complex processing above (in particular, the 'no event earlier but close enough' and 'no event later but close enough' criteria. The 'close enough' criteria most definitely complicate this question.

CREATE TABLE Large_Table
(
    Event_Time  DATETIME YEAR TO MINUTE NOT NULL,
    User_ID     CHAR(15) NOT NULL,
    Other_Data  INTEGER NOT NULL,
    PRIMARY KEY(User_ID, Event_Time)
);

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 09:15', 'Alpha',  1) { R4 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 11:15', 'Alpha',  2) { R4 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 13:15', 'Alpha',  3) { R4 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 15:15', 'Alpha',  4) { R4 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 12:17', 'Beta',   1) { R4 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 09:15', 'Alpha',  5) { R1 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 10:17', 'Beta',   2) { R1 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-03 09:15', 'Alpha',  6) { R2 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-03 11:15', 'Alpha',  7) { R2 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-03 10:17', 'Beta',   3) { R1 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 09:15', 'Alpha',  8) { R3 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 11:15', 'Alpha',  9) { R3 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 13:15', 'Alpha', 10) { R3 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 10:17', 'Beta',   4) { R1 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 09:15', 'Alpha', 11) { R2 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 11:15', 'Alpha', 12) { R2 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 10:17', 'Beta',   5) { R1 };
{ Probe here }
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 15:15', 'Alpha', 13) { R2 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 17:15', 'Alpha', 14) { R2 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 16:17', 'Beta',   6) { R1 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 09:15', 'Alpha', 15) { R6 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 11:15', 'Alpha', 16) { R6 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 13:15', 'Alpha', 17) { R6 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 15:15', 'Alpha', 18) { R6 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 17:15', 'Alpha', 19) { R6 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 19:15', 'Alpha', 20) { R6 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 16:17', 'Beta',   7) { R1 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 09:15', 'Alpha', 21) { R1 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 11:17', 'Beta',   8) { R1 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 12:15', 'Alpha', 22) { R1 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 13:17', 'Beta',   9) { R1 };

INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 09:15', 'Alpha', 23) { R5 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 11:15', 'Alpha', 24) { R5 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 13:15', 'Alpha', 25) { R5 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 15:15', 'Alpha', 26) { R5 };
INSERT INTO Large_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 17:15', 'Alpha', 27) { R5 };

小桌子

注意:出于测试目的,小表实际上包含比大表更多的行.小表中Other_Data值大于100的行不应出现在结果中(也不会出现在结果中).这里的测试确实是在边缘条件下进行的.

Small Table

Note: for the purposes of testing, the small table actually contains more rows than the large table. The rows in the small table with Other_Data values larger than 100 should not appear in the results (and don't). The tests here do poke at the edge conditions.

CREATE TABLE Small_Table
(
    Event_Time  DATETIME YEAR TO MINUTE NOT NULL,
    User_ID     CHAR(15) NOT NULL,
    Other_Data  INTEGER NOT NULL,
    PRIMARY KEY(User_ID, Event_Time)
);

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 06:15', 'Alpha', 131) { XX };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 06:20', 'Alpha',  31) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 10:20', 'Alpha',  32) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 13:20', 'Alpha',  33) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 15:20', 'Alpha',  34) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 18:15', 'Alpha', 134) { XX };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 06:15', 'Alpha', 135) { XX };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 06:16', 'Alpha',  35) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 10:20', 'Alpha',  35) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 12:14', 'Alpha',  35) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 12:15', 'Alpha', 135) { XX };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-03 09:20', 'Alpha',  36) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-03 11:20', 'Alpha',  37) { YY };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 09:20', 'Alpha',  38) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 11:20', 'Alpha',  39) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 13:20', 'Alpha',  40) { YY };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 09:20', 'Alpha',  41) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 11:20', 'Alpha',  42) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 13:20', 'Alpha',  42) { 22 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 15:20', 'Alpha',  43) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 17:20', 'Alpha',  44) { YY };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 09:20', 'Alpha',  45) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 11:20', 'Alpha',  46) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 13:20', 'Alpha',  47) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 15:20', 'Alpha',  48) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 17:20', 'Alpha',  49) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 19:20', 'Alpha',  50) { YY };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 09:20', 'Alpha',  51) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 10:20', 'Alpha',  51) { 22 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 12:20', 'Alpha',  52) { YY };

INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 09:20', 'Alpha',  53) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 11:20', 'Alpha',  54) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 13:20', 'Alpha',  55) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 15:20', 'Alpha',  56) { YY };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-08 17:20', 'Alpha',  57) { YY };


INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 13:27', 'Beta',   9) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-07 11:27', 'Beta',   8) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-06 16:27', 'Beta',   7) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 16:27', 'Beta',   6) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-05 10:27', 'Beta',   5) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-04 10:27', 'Beta',   4) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-03 10:27', 'Beta',   3) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-02 10:27', 'Beta',   2) { R1 };
INSERT INTO Small_Table(Event_Time, User_ID, Other_Data) VALUES('2012-01-01 12:27', 'Beta',   1) { R4 };

最终查询结果

使用上面的数据,获得的结果是:

Final Query Results

Using the data above, the results obtained were:

Alpha   2012-01-01 06:20   2012-01-01 09:15   2012-01-01 15:15   31
Alpha   2012-01-01 10:20   2012-01-01 09:15   2012-01-01 15:15   32
Alpha   2012-01-01 13:20   2012-01-01 09:15   2012-01-01 15:15   33
Alpha   2012-01-01 15:20   2012-01-01 09:15   2012-01-01 15:15   34
Alpha   2012-01-02 06:16   2012-01-02 09:15   2012-01-02 09:15   35
Alpha   2012-01-02 10:20   2012-01-02 09:15   2012-01-02 09:15   35
Alpha   2012-01-02 12:14   2012-01-02 09:15   2012-01-02 09:15   35
Alpha   2012-01-03 09:20   2012-01-03 09:15   2012-01-03 11:15   36
Alpha   2012-01-03 11:20   2012-01-03 09:15   2012-01-03 11:15   37
Alpha   2012-01-04 09:20   2012-01-04 09:15   2012-01-04 13:15   38
Alpha   2012-01-04 11:20   2012-01-04 09:15   2012-01-04 13:15   39
Alpha   2012-01-04 13:20   2012-01-04 09:15   2012-01-04 13:15   40
Alpha   2012-01-05 09:20   2012-01-05 09:15   2012-01-05 11:15   41
Alpha   2012-01-05 11:20   2012-01-05 09:15   2012-01-05 11:15   42
Alpha   2012-01-05 13:20   2012-01-05 09:15   2012-01-05 11:15   42
Alpha   2012-01-05 13:20   2012-01-05 15:15   2012-01-05 17:15   42
Alpha   2012-01-05 15:20   2012-01-05 15:15   2012-01-05 17:15   43
Alpha   2012-01-05 17:20   2012-01-05 15:15   2012-01-05 17:15   44
Alpha   2012-01-06 09:20   2012-01-06 09:15   2012-01-06 19:15   45
Alpha   2012-01-06 11:20   2012-01-06 09:15   2012-01-06 19:15   46
Alpha   2012-01-06 13:20   2012-01-06 09:15   2012-01-06 19:15   47
Alpha   2012-01-06 15:20   2012-01-06 09:15   2012-01-06 19:15   48
Alpha   2012-01-06 17:20   2012-01-06 09:15   2012-01-06 19:15   49
Alpha   2012-01-06 19:20   2012-01-06 09:15   2012-01-06 19:15   50
Alpha   2012-01-07 09:20   2012-01-07 09:15   2012-01-07 09:15   51
Alpha   2012-01-07 09:20   2012-01-07 12:15   2012-01-07 12:15   51
Alpha   2012-01-07 10:20   2012-01-07 09:15   2012-01-07 09:15   51
Alpha   2012-01-07 10:20   2012-01-07 12:15   2012-01-07 12:15   51
Alpha   2012-01-07 12:20   2012-01-07 12:15   2012-01-07 12:15   52
Alpha   2012-01-08 09:20   2012-01-08 09:15   2012-01-08 17:15   53
Alpha   2012-01-08 11:20   2012-01-08 09:15   2012-01-08 17:15   54
Alpha   2012-01-08 13:20   2012-01-08 09:15   2012-01-08 17:15   55
Alpha   2012-01-08 15:20   2012-01-08 09:15   2012-01-08 17:15   56
Alpha   2012-01-08 17:20   2012-01-08 09:15   2012-01-08 17:15   57
Beta    2012-01-01 12:27   2012-01-01 12:17   2012-01-01 12:17    1
Beta    2012-01-02 10:27   2012-01-02 10:17   2012-01-02 10:17    2
Beta    2012-01-03 10:27   2012-01-03 10:17   2012-01-03 10:17    3
Beta    2012-01-04 10:27   2012-01-04 10:17   2012-01-04 10:17    4
Beta    2012-01-05 10:27   2012-01-05 10:17   2012-01-05 10:17    5
Beta    2012-01-05 16:27   2012-01-05 16:17   2012-01-05 16:17    6
Beta    2012-01-06 16:27   2012-01-06 16:17   2012-01-06 16:17    7
Beta    2012-01-07 11:27   2012-01-07 11:17   2012-01-07 13:17    8
Beta    2012-01-07 13:27   2012-01-07 11:17   2012-01-07 13:17    9

中间结果

实际上格式略有不同.

Intermediate results

Slightly different formatting in effect.

单个范围

Alpha|2012-01-02 09:15|2012-01-02 09:15
Alpha|2012-01-07 09:15|2012-01-07 09:15
Alpha|2012-01-07 12:15|2012-01-07 12:15
Beta|2012-01-01 12:17|2012-01-01 12:17
Beta|2012-01-02 10:17|2012-01-02 10:17
Beta|2012-01-03 10:17|2012-01-03 10:17
Beta|2012-01-04 10:17|2012-01-04 10:17
Beta|2012-01-05 10:17|2012-01-05 10:17
Beta|2012-01-05 16:17|2012-01-05 16:17
Beta|2012-01-06 16:17|2012-01-06 16:17

Doubleton范围

Doubleton Ranges

Alpha|2012-01-03 09:15|2012-01-03 11:15
Alpha|2012-01-05 09:15|2012-01-05 11:15
Alpha|2012-01-05 15:15|2012-01-05 17:15
Beta|2012-01-07 11:17|2012-01-07 13:17

多个事件范围

Alpha|2012-01-01 09:15|2012-01-01 15:15
Alpha|2012-01-04 09:15|2012-01-04 13:15
Alpha|2012-01-06 09:15|2012-01-06 19:15
Alpha|2012-01-08 09:15|2012-01-08 17:15

这篇关于一个表中的哪一行(按某列顺序)对应于另一个表中的一行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-19 18:55