问题描述
假设我有三个表A,B和C.每个表都有两列:主键和其他一些数据.它们每个具有相同的行数.如果我在主键上JOIN
A和B,则我应该得到与其中任一行相同的行数(而不是A.rows * B.rows).
Let's say I have three tables A, B, and C. Each has two columns: a primary key and some other piece of data. They each have the same number of rows. If I JOIN
A and B on the primary key, I should end up with the same number of rows as are in either of them (as opposed to A.rows * B.rows).
现在,如果我将JOIN
A JOIN B
与C
一起使用,为什么我最终会出现重复的行?我有几次遇到这个问题,我不理解.似乎它应该产生与JOIN
ing A
和B
相同的结果,因为它具有相同的行数,但是会产生重复项.
Now, if I JOIN
A JOIN B
with C
, why do I end up with duplicate rows? I have run into this problem on several occasions and I do not understand it. It seems like it should produce the same result as JOIN
ing A
and B
since it has the same number of rows but, instead, duplicates are produced.
产生此类结果的查询的格式
Queries that produce results like this are of the format
SELECT *
FROM M
INNER JOIN S
on M.mIndex = S.mIndex
INNER JOIN D
ON M.platformId LIKE '%' + D.version + '%'
INNER JOIN H
ON D.Name = H.Name
AND D.revision = H.revision
这是表的架构. H包含一个历史表,其中包含D中曾经存在的所有内容.每个D有很多M行,每个M有一个S.
Here are schemas for the tables. H contains is a historic table containing everything that was ever in D. There are many M rows for each D and one S for each M.
表M
[mIndex] [int] NOT NULL PRIMARY KEY,
[platformId] [nvarchar](256) NULL,
[ip] [nvarchar](64) NULL,
[complete] [bit] NOT NULL,
[date] [datetime] NOT NULL,
[DeployId] [int] NOT NULL PRIMARY KEY REFERENCES D.DeployId,
[source] [nvarchar](64) NOT NULL PRIMARY KEY
表S
[order] [int] NOT NULL PRIMARY KEY,
[name] [nvarchar](64) NOT NULL,
[parameters] [nvarchar](256) NOT NULL,
[Finished] [bit] NOT NULL,
[mIndex] [int] NOT NULL PRIMARY KEY,
[mDeployId] [int] NOT NULL PRIMARY KEY,
[Date] [datetime] NULL,
[status] [nvarchar](10) NULL,
[output] [nvarchar](max) NULL,
[config] [nvarchar](64) NOT NULL PRIMARY KEY
表D
[Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[branch] [nvarchar](64) NOT NULL,
[revision] [int] NOT NULL,
[version] [nvarchar](64) NOT NULL,
[path] [nvarchar](256) NOT NULL
表H
[IdDeploy] [int] IDENTITY(1,1) NOT NULL,
[name] [nvarchar](64) NOT NULL,
[version] [nvarchar](64) NOT NULL,
[path] [nvarchar](max) NOT NULL,
[StartDate] [datetime] NOT NULL,
[EndDate] [datetime] NULL,
[Revision] [nvarchar](64) NULL,
我之所以没有发布表格和查询,是因为我对自己了解这个问题并在将来避免出现这个问题更感兴趣.
I didn't post the tables and query initially because I am more interested in understanding this problem for myself and avoiding it in the future.
推荐答案
如果表M
,S
,D
或H
之一对于给定的Id
(如果仅Id
列不是主键),则查询将导致重复"行.如果表中Id
的行多于一列,则将唯一标识一行的其他列也必须包含在JOIN条件中.
If one of the tables M
, S
, D
, or H
has more than one row for a given Id
(if just the Id
column is not the Primary Key), then the query would result in "duplicate" rows. If you have more than one row for an Id
in a table, then the other columns, which would uniquely identify a row, also must be included in the JOIN condition(s).
参考:
这篇关于为什么多表联接会产生重复的行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!