使用临时表删除重复记录

本文介绍了使用临时表删除重复记录的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

以下面的 tsql 查询为例:

Take the tsql query below:

DECLARE @table TABLE(data VARCHAR(20))
INSERT INTO @table VALUES ('not duplicate row')
INSERT INTO @table VALUES ('duplicate row')
INSERT INTO @table VALUES ('duplicate row')
INSERT INTO @table VALUES ('second duplicate row')
INSERT INTO @table VALUES ('second duplicate row')

SELECT   data
INTO     #duplicates
FROM     @table
GROUP BY data
HAVING   COUNT(*) > 1

-- delete all rows that are duplicated
DELETE   FROM @table
FROM     @table o INNER JOIN #duplicates d
         ON d.data = o.data

-- insert one row for every duplicate set
INSERT   INTO @table(data)
         SELECT   data
         FROM     #duplicates

我明白它在做什么，但逻辑的最后一部分(在为每个重复集插入一行之后)没有意义.我们在哪里有用于 --delete 所有重复行的代码集，它消除了重复项，那么最后一节的部分是什么?

I understand what it is doing, but the last part of logic (after --insert one row for every duplicate set), doesn't make sense. Where we have the set of code for --delete all rows that are duplicated, that gets rid of the duplicates so what's the part of the last section?

找到此查询这里

谢谢

推荐答案

首先，它删除所有有重复的行.即诸行，原也.在上述情况下，DELETE 后，表中将仅保留一行('非重复行').所有其他四行都将被删除.

First, it deletes all rows that ever had duplicates. That is, all rows, and original also. In the case above, only one row ('not duplicate row') will remain in the table after DELETE. All four other rows will be deleted.

然后再次使用已删除的行填充表，但现在删除了重复项.

Then is populates the table with the deleted rows again, but now the duplicates are removed.

这不是删除重复项的最佳方法.

This is not the best way to delete duplicates.

最好的方法是:

WITH q AS (
          SELECT data, ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
          FROM @table
          )
DELETE
FROM      q
WHERE     rn > 1

这篇关于使用临时表删除重复记录的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！