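I'm trying to find rows that are duplicated across several columns in a large table (nearly 18,000 rows). The problem is that the query takes a very long time. I have tried the following: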

 SELECT * FROM table_name a, table_name b
 WHERE a.col1 = b.col1
 AND a.col2 = b.col2
 AND a.col3 = b.col3
 AND a.col4 = b.col4
 AND a.id <> b.id


And this one:

SELECT *
FROM table_name
WHERE col1 IN (
    SELECT col1
    FROM table_name
    GROUP BY col1
    HAVING count(col1) > 1
    )
AND col2 IN (
    SELECT col2
    FROM table_name
    GROUP BY col2
    HAVING count(col2) > 1
    )
AND col3 IN (
    SELECT col3
    FROM table_name
    GROUP BY col3
    HAVING count(col3) > 1
    )
AND col4 IN (
    SELECT col4
    FROM table_name
    GROUP BY col4
    HAVING count(col4) > 1
    )


Both of them work, but they are too slow. Any ideas?

Best answer

You could try a single GROUP BY over all of the columns, for example:

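-- List the grouped columns explicitly: SELECT * is generally rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5)
SELECT col1, col2, col3, col4, COUNT(*) AS cnt
FROM table_name
GROUP BY col1, col2, col3, col4
HAVING COUNT(*) > 1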


At the very least, it looks cleaner.
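
A grouped query yields one row per duplicated combination rather than every duplicate row. If the full rows are needed, one possible approach is to join the table back to the grouped result. This is only a sketch: it assumes the `id`, `col1` ... `col4` columns from the question, and the aliases `t` and `d` are made up for illustration.

```sql
-- Sketch: return every row whose (col1, col2, col3, col4) combination
-- appears more than once in table_name.
SELECT t.*
FROM table_name AS t
JOIN (
    -- One row per combination that occurs at least twice.
    SELECT col1, col2, col3, col4
    FROM table_name
    GROUP BY col1, col2, col3, col4
    HAVING COUNT(*) > 1
) AS d
  ON  t.col1 = d.col1
  AND t.col2 = d.col2
  AND t.col3 = d.col3
  AND t.col4 = d.col4
-- Keep the members of each duplicate group next to each other.
ORDER BY t.col1, t.col2, t.col3, t.col4, t.id;
```

Note that `=` never matches NULL, so rows with NULL in any of the four columns are skipped by the join even though GROUP BY places NULLs in the same group. MySQL's NULL-safe `<=>` operator could be used in the ON clause if that matters for your data.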

Edit

To return all of the rows, with each column's matches narrowed to a subset of the previous column's:

SELECT *
FROM table_name
WHERE col4 IN (
  SELECT col4
  FROM table_name
  WHERE col3 IN (
    SELECT col3
    FROM table_name
    WHERE col2 IN (
      SELECT col2
      FROM table_name
      WHERE col1 IN (
        SELECT col1
        FROM table_name
        GROUP BY col1
        HAVING count(col1) > 1
        )
      )
    )
  )


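Conceptually, this should give you all of the results while cutting the execution time. Note, however, that each IN tests one column on its own, so the result can include rows that are not duplicated across all four columns together.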
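
Independently of how the query is written, a composite index on the four compared columns is a common way to speed up this kind of grouping or self-join. Whether it helps here is an assumption, since the question does not show the table's schema or existing indexes. The index name `idx_dup_cols` below is hypothetical.

```sql
-- Hypothetical index name; assumes no equivalent index already exists.
CREATE INDEX idx_dup_cols ON table_name (col1, col2, col3, col4);

-- Inspect the plan to see whether the optimizer actually uses the index
-- for the grouped duplicate check.
EXPLAIN
SELECT col1, col2, col3, col4, COUNT(*) AS cnt
FROM table_name
GROUP BY col1, col2, col3, col4
HAVING COUNT(*) > 1;
```

If the columns hold long strings, the index may run into key-length limits, in which case prefix lengths on the string columns would be one thing to try.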

Regarding "mysql - getting duplicate rows by several columns in MySQL", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17367742/
