I have a content table with four columns: id, timestamp, locationID and authorID. Here is a sample of my data; in production it is tens of millions of rows long.
id timestamp locationID authorID
1 2012-03-01 11:52:00 1 1
2 2012-03-16 19:56:00 1 2
3 2012-04-02 11:26:00 2 1
4 2012-04-22 11:52:00 2 3
5 2012-05-19 09:48:00 2 2
6 2012-05-30 07:12:00 2 1
7 2012-06-04 19:17:00 1 2
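I want to collect the list of authorIDs whose most recent content (ordered by timestamp) matches a specific locationID. The correct value for a query on locationID = 2 is [ 1, 3 ], because authorIDs 1 and 3 were most recently seen at locationID = 2, whereas authorID 2's most recent content is at locationID 1.
I could of course run one query per authorID, but in production the authorID array is more than 100,000 entries long. That seems extremely inefficient (especially when each "sub-query" would hit the content table with its millions of rows), so I am looking for a better way to pull this data out of my dataset, ideally one fast enough to run on the database during page rendering.
Best answer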
Something like this? This is from SQL Server, but I think it should work in MySQL as well.
DECLARE @locationId INT
SET @locationId = 2;
SELECT *
FROM (SELECT AuthorId, Max(TimeStamp) as MaxTimeStamp
FROM Content C
WHERE LocationId = @locationId
GROUP BY AuthorId) AS CBL
LEFT JOIN Content AS C ON CBL.AuthorId = C.AuthorId
AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL
For locationId = 2 it returns 1 and 3; for locationId = 1 it returns 2.
Per JW (thanks!), the correct MySQL way:
SET @locationId := 2;
SELECT *
FROM (SELECT AuthorId, Max(TimeStamp) as MaxTimeStamp
FROM Content C
WHERE LocationId = @locationId
GROUP BY AuthorId) AS CBL
LEFT JOIN Content AS C ON CBL.AuthorId = C.AuthorId
AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL
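To check the query against the sample data from the question, here is a minimal sketch that loads the seven rows into an in-memory SQLite database and runs the same anti-join. SQLite is only a stand-in for MySQL here, and several details are assumptions for illustration rather than part of the answer: the helper names `build_sample_db` and `latest_authors_at`, the bound `?` parameter in place of `@locationId`, selecting only `AuthorId` instead of `*`, and the added `ORDER BY` to make the output deterministic.

```python
import sqlite3

# Sample rows from the question: (id, timestamp, locationID, authorID).
ROWS = [
    (1, "2012-03-01 11:52:00", 1, 1),
    (2, "2012-03-16 19:56:00", 1, 2),
    (3, "2012-04-02 11:26:00", 2, 1),
    (4, "2012-04-22 11:52:00", 2, 3),
    (5, "2012-05-19 09:48:00", 2, 2),
    (6, "2012-05-30 07:12:00", 2, 1),
    (7, "2012-06-04 19:17:00", 1, 2),
]

# Same anti-join as in the answer. The subquery finds each author's latest
# timestamp at the requested location; the LEFT JOIN looks for any later row
# by the same author (at any location); IS NULL keeps authors with none.
QUERY = """
SELECT CBL.AuthorId
FROM (SELECT AuthorId, MAX(TimeStamp) AS MaxTimeStamp
      FROM Content
      WHERE LocationId = ?
      GROUP BY AuthorId) AS CBL
LEFT JOIN Content AS C
       ON CBL.AuthorId = C.AuthorId
      AND C.TimeStamp > CBL.MaxTimeStamp
WHERE C.AuthorId IS NULL
ORDER BY CBL.AuthorId
"""


def build_sample_db():
    """Create an in-memory database holding the sample rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Timestamps are stored as ISO-8601 text, which sorts chronologically.
    conn.execute(
        "CREATE TABLE Content ("
        "Id INTEGER PRIMARY KEY, TimeStamp TEXT, "
        "LocationId INTEGER, AuthorId INTEGER)"
    )
    conn.executemany("INSERT INTO Content VALUES (?, ?, ?, ?)", ROWS)
    return conn


def latest_authors_at(conn, location_id):
    """Return the authors whose most recent content is at location_id."""
    return [row[0] for row in conn.execute(QUERY, (location_id,))]


if __name__ == "__main__":
    conn = build_sample_db()
    print(latest_authors_at(conn, 2))  # prints [1, 3]
    print(latest_authors_at(conn, 1))  # prints [2]
```

Author 2 drops out for location 2 because the row from 2012-06-04 at location 1 is later than that author's latest row at location 2, so the LEFT JOIN finds a match and the IS NULL filter removes it.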
Regarding mysql - MySQL: finding the latest value for a list of sub-documents, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14168261/