I couldn't find an abstract, general way to describe my problem, so I'll just give a minimal example.
Suppose I have these 3 simple tables:
CREATE TABLE Document(
[Id] int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
[Title] nvarchar(MAX),
[Patient] nvarchar(MAX)
);
CREATE TABLE Link(
DocumentId INT FOREIGN KEY REFERENCES Document(Id),
Text nvarchar(max)
);
CREATE TABLE ReadStatus(
DocumentId INT FOREIGN KEY REFERENCES Document(Id),
IsRead Bit NOT NULL,
UserId Int NOT NULL
);
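Read status is tracked in the ReadStatus table, which associates users with documents: IsRead=1 means the user has read the document, and IsRead=0 means the user has not read it yet. If no row exists in ReadStatus for a document X and a user A, we assume that user A has not read document X.
Now I need to run a query that selects all patients. For each patient, I need the total number of available documents and the number of documents that have been read (i.e. IsRead=1). This is what I have so far:
SELECT d.Patient,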
COUNT(DISTINCT d.Id) AS DocumentCount,
COUNT(NULLIF(rs.IsRead,0)) AS ReadDocumentCount,
COUNT(*) OVER () AS TotalPatientCount
FROM Document d
LEFT OUTER JOIN ReadStatus AS rs ON d.Id = rs.DocumentId AND rs.UserId = 123
INNER JOIN Link AS l ON d.Id = l.DocumentId AND l.Text IN ('Link W', 'Link X', 'Link T', 'Link Z')
GROUP BY d.Patient
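The problem occurs when a document that has already been read has multiple links. If the document has 3 links, the INNER JOIN with the Link table produces one joined row per matching link, so ReadDocumentCount comes out as 3 instead of 1. In other words, given this data: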
INSERT INTO Document(Title, Patient) VALUES('Doc A', 'Mike')
INSERT INTO Document(Title, Patient) VALUES('Doc B', 'Mike')
INSERT INTO Link(DocumentId, Text) VALUES(1, N'Link W')
INSERT INTO Link(DocumentId, Text) VALUES(1, N'Link X')
INSERT INTO Link(DocumentId, Text) VALUES(1, N'Link Y')
INSERT INTO Link(DocumentId, Text) VALUES(2, N'Link Z')
INSERT INTO ReadStatus(DocumentID, IsRead, UserId) VALUES(1, 1, 123)
INSERT INTO ReadStatus(DocumentID, IsRead, UserId) VALUES(2, 0, 123)
I get this result:
Patient  DocumentCount  ReadDocumentCount  TotalPatientCount
Mike     2              3                  1
whereas this is what I want:
Patient  DocumentCount  ReadDocumentCount  TotalPatientCount
Mike     2              1                  1
SQL Fiddle: http://sqlfiddle.com/#!6/e06bf/3
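To see where the extra counts come from, it can help to look at the joined rows before any grouping happens. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the schema is simplified (INTEGER/TEXT columns, no IDENTITY), and the sample rows are hypothetical. They are modeled on the question's data, except that all three links of Doc A are chosen to match the filter so that the fan-out is exactly 3.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Document(Id INTEGER PRIMARY KEY, Title TEXT, Patient TEXT);
CREATE TABLE Link(DocumentId INTEGER REFERENCES Document(Id), Text TEXT);
CREATE TABLE ReadStatus(
    DocumentId INTEGER REFERENCES Document(Id),
    IsRead INTEGER NOT NULL,
    UserId INTEGER NOT NULL
);
-- Hypothetical rows: Doc A (Id 1) is read and has three matching links.
INSERT INTO Document(Title, Patient) VALUES ('Doc A', 'Mike'), ('Doc B', 'Mike');
INSERT INTO Link(DocumentId, Text) VALUES
    (1, 'Link W'), (1, 'Link X'), (1, 'Link T'), (2, 'Link Z');
INSERT INTO ReadStatus(DocumentId, IsRead, UserId) VALUES (1, 1, 123), (2, 0, 123);
""")

# Same joins as in the question, but without GROUP BY or aggregates.
joined_rows = conn.execute("""
SELECT d.Id, d.Patient, rs.IsRead, l.Text
FROM Document d
LEFT OUTER JOIN ReadStatus rs ON d.Id = rs.DocumentId AND rs.UserId = 123
INNER JOIN Link l ON d.Id = l.DocumentId
    AND l.Text IN ('Link W', 'Link X', 'Link T', 'Link Z')
ORDER BY d.Id, l.Text
""").fetchall()

for row in joined_rows:
    print(row)

# Rows where IsRead = 1: this is what COUNT(NULLIF(rs.IsRead, 0)) counts.
read_rows = [row for row in joined_rows if row[2] == 1]
# Distinct documents among those rows: this is the number actually wanted.
read_documents = {row[0] for row in read_rows}
print(len(read_rows), len(read_documents))
```

Document 1 appears once per matching link, and every copy carries IsRead = 1, so a plain COUNT over those rows counts the same read document several times.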
Best answer
You can also use COUNT(DISTINCT) conditionally:
SELECT d.Patient,
COUNT(DISTINCT d.Id) AS DocumentCount,
COUNT(DISTINCT (CASE WHEN rs.IsRead <> 0 THEN d.id END)) AS ReadDocumentCount,
COUNT(*) OVER () AS TotalPatientCount
FROM Document d LEFT OUTER JOIN
ReadStatus rs
ON d.Id = rs.DocumentId AND rs.UserId = 123 INNER JOIN
Link l
ON d.Id = l.DocumentId AND l.Text IN ('Link W', 'Link X', 'Link T', 'Link Z')
GROUP BY d.Patient;
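The conditional COUNT(DISTINCT) works because the CASE expression yields the document Id for read rows and NULL otherwise, and COUNT(DISTINCT ...) ignores NULLs and collapses repeated Ids. A minimal sketch comparing both aggregates is shown here. It again assumes Python's sqlite3 module as a stand-in for SQL Server, with a simplified schema and hypothetical sample rows in which Doc A has three matching links. The COUNT(*) OVER () column is left out to keep the sketch independent of window-function support in the installed SQLite version.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Document(Id INTEGER PRIMARY KEY, Title TEXT, Patient TEXT);
CREATE TABLE Link(DocumentId INTEGER REFERENCES Document(Id), Text TEXT);
CREATE TABLE ReadStatus(
    DocumentId INTEGER REFERENCES Document(Id),
    IsRead INTEGER NOT NULL,
    UserId INTEGER NOT NULL
);
-- Hypothetical rows: Doc A (Id 1) is read and has three matching links.
INSERT INTO Document(Title, Patient) VALUES ('Doc A', 'Mike'), ('Doc B', 'Mike');
INSERT INTO Link(DocumentId, Text) VALUES
    (1, 'Link W'), (1, 'Link X'), (1, 'Link T'), (2, 'Link Z');
INSERT INTO ReadStatus(DocumentId, IsRead, UserId) VALUES (1, 1, 123), (2, 0, 123);
""")

# Joins shared by both queries, copied from the question.
JOINS = """
FROM Document d
LEFT OUTER JOIN ReadStatus rs ON d.Id = rs.DocumentId AND rs.UserId = 123
INNER JOIN Link l ON d.Id = l.DocumentId
    AND l.Text IN ('Link W', 'Link X', 'Link T', 'Link Z')
GROUP BY d.Patient
"""

# Question's version: counts every joined row whose IsRead is non-zero.
original = conn.execute(
    "SELECT d.Patient, COUNT(DISTINCT d.Id), COUNT(NULLIF(rs.IsRead, 0))" + JOINS
).fetchone()

# Answer's version: counts each read document Id only once.
conditional = conn.execute(
    "SELECT d.Patient, COUNT(DISTINCT d.Id),"
    " COUNT(DISTINCT (CASE WHEN rs.IsRead <> 0 THEN d.Id END))" + JOINS
).fetchone()

print(original)     # ('Mike', 2, 3)
print(conditional)  # ('Mike', 2, 1)
```

Rows with no ReadStatus entry have rs.IsRead = NULL after the LEFT JOIN, so the CASE condition is not satisfied and those documents are treated as unread, which matches the rule stated in the question.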
Source: a similar question on Stack Overflow about sql (counting the rows of a subgroup while ignoring duplicates): https://stackoverflow.com/questions/34529295/