


I have a query that returns rows like these:

Date        User    Time    Location    Service     Count
1/1/2018    Nick    12:00   Location A  X           1
1/1/2018    Nick    12:01   Location A  Y           1
1/1/2018    John    12:02   Location B  Z           1
1/1/2018    Harry   12:03   Location A  X           1
1/1/2018    Harry   12:04   Location A  X           1
1/1/2018    Harry   12:05   Location B  Y           1
1/1/2018    Harry   12:06   Location B  X           1
1/1/2018    Nick    12:07   Location A  X           1
1/1/2018    Nick    12:08   Location A  Y           1

The query returns the locations visited by each user and a count of picks done from each location. Results are sorted by user and time ascending. I need to group CONSECUTIVE rows that have the same User and Location into a single row, with the SUM of the Count column and a comma-separated list of the unique values in the Service column. The final result should look something like this:

Date        User    Start Time  End Time    Location    Service Count
1/1/2018    Nick    12:00       12:01       Location A  X,Y     2
1/1/2018    John    12:02       12:02       Location B  Z       1
1/1/2018    Harry   12:03       12:04       Location A  X       2
1/1/2018    Harry   12:05       12:06       Location B  X,Y     2
1/1/2018    Nick    12:07       12:08       Location A  X,Y     2


I'm not sure where to start. Maybe lag or partition clauses? Hoping an SQL guru can help here...



This is a gaps and islands problem. One method for solving it uses row_number():

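select Date, User, min(Time) as start_time, max(Time) as end_time,
       Location,
       -- listagg keeps duplicates; some databases (Oracle 19c and later, for example)
       -- accept listagg(distinct Service, ',') to return unique values
       listagg(Service, ',') within group (order by Service) as services,
       sum(Count) as cnt  -- the question asks for the SUM of Count
from (select t.*,
             -- position of the row among all rows for the date
             row_number() over (partition by Date order by Time) as seqnum,
             -- position of the row among rows for the same user, date and location
             row_number() over (partition by User, Date, Location order by Time) as seqnum_2
      from t
     ) t
-- the difference is constant within a run of consecutive rows
group by Date, User, Location, (seqnum - seqnum_2);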


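It is a bit tricky to explain how this works. Within a run of consecutive rows for the same user and location, both row numbers increase by one from row to row, so their difference stays constant. When a row for another user or location intervenes, only the first row number advances, so the difference changes for the next run. My suggestion is to run the subquery on its own and you will see how the difference of row numbers defines the groups that you are looking for.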
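
For anyone who wants to try this without touching the real table, here is a minimal sketch that runs the subquery against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in database (SQLite 3.25 or later is needed for window functions); the table name t matches the answer, while the function name island_keys and the exact column types are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (date, user, time, location, service, count).
ROWS = [
    ("1/1/2018", "Nick", "12:00", "Location A", "X", 1),
    ("1/1/2018", "Nick", "12:01", "Location A", "Y", 1),
    ("1/1/2018", "John", "12:02", "Location B", "Z", 1),
    ("1/1/2018", "Harry", "12:03", "Location A", "X", 1),
    ("1/1/2018", "Harry", "12:04", "Location A", "X", 1),
    ("1/1/2018", "Harry", "12:05", "Location B", "Y", 1),
    ("1/1/2018", "Harry", "12:06", "Location B", "X", 1),
    ("1/1/2018", "Nick", "12:07", "Location A", "X", 1),
    ("1/1/2018", "Nick", "12:08", "Location A", "Y", 1),
]

# The inner query from the answer, with an outer "order by" added for readable output.
SUBQUERY = """
    select user, time, location,
           row_number() over (partition by date order by time) as seqnum,
           row_number() over (partition by user, date, location order by time) as seqnum_2
    from t
    order by time
"""


def island_keys(rows=ROWS):
    """Return (user, time, location, seqnum, seqnum_2, seqnum - seqnum_2) per row."""
    con = sqlite3.connect(":memory:")  # throwaway in-memory database
    con.execute(
        "create table t (date text, user text, time text, "
        "location text, service text, count integer)"
    )
    con.executemany("insert into t values (?, ?, ?, ?, ?, ?)", rows)
    # Append the difference of the two row numbers to every result row.
    result = [row + (row[3] - row[4],) for row in con.execute(SUBQUERY)]
    con.close()
    return result


if __name__ == "__main__":
    for row in island_keys():
        print(row)
```

On the sample data the last column comes out as 0, 0, 2, 3, 3, 5, 5, 5, 5. Note that Harry's rows at Location B and Nick's second visit to Location A both get a difference of 5, which is why User and Location have to stay in the group by alongside the difference.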

