This post covers how to handle a query against very large tables.

Question

We have some very large tables that have identity column keys, but we think a lot of the actual data in the rows might be duplicated. We are trying to find out how many actually distinct rows there are among the hundreds of millions. Some of the tables have several columns. We tried running a 'select count(*)' over a subquery ('select distinct'):

 

select count(*) from (select distinct column1, column2, column3, ...column40 from tablex) q1

 

We tried this on one of the tables and it ran for a long time. Does anyone have a better option?

Recommended answer

 





You need to include only the required columns inside the select distinct, that is, the columns on which you identify duplicates.
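As a rough illustration of that advice, the sketch below compares the count over every data column with the count over only the columns assumed to define a duplicate. It uses Python's built-in sqlite3 module with a tiny in-memory table. The three-column layout of tablex, the sample rows, and the helper count_distinct are all hypothetical, and the original database engine is not stated, so treat this as a sketch rather than the poster's actual setup.

```python
import sqlite3

# Hypothetical stand-in for the real table: an identity key plus three data columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablex ("
    "id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT, column3 TEXT)"
)
rows = [
    ("a", "x", "note 1"),
    ("a", "x", "note 2"),  # same column1/column2 as the first row
    ("b", "y", "note 3"),
    ("b", "y", "note 3"),  # full duplicate of the previous row
]
conn.executemany(
    "INSERT INTO tablex (column1, column2, column3) VALUES (?, ?, ?)", rows
)


def count_distinct(conn, columns):
    """Count the distinct combinations of the given columns in tablex."""
    # Column names are interpolated into the SQL text, so they must come
    # from trusted code, never from user input.
    column_list = ", ".join(columns)
    query = (
        f"SELECT COUNT(*) FROM (SELECT DISTINCT {column_list} FROM tablex) q1"
    )
    return conn.execute(query).fetchone()[0]


# Distinct rows when every data column takes part in the comparison.
all_columns = count_distinct(conn, ["column1", "column2", "column3"])
# Distinct rows when only the columns assumed to identify a duplicate are used.
key_columns = count_distinct(conn, ["column1", "column2"])
print(all_columns, key_columns)
```

The two counts answer different questions. Narrowing the column list is only valid if the dropped columns really are irrelevant to what counts as a duplicate, and whether it runs faster on a table with hundreds of millions of rows depends on the engine and the available indexes.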



07-29 11:53