This is related to this question: split records into buckets based on a sum of counts
The new problem is that one person can have many faxes, and since I'm trying to get a certain number of people, I usually end up with fewer than I need, because the same person lands in more than one fax bucket.
Is there a way to get a count of UNIQUE people in the attached example?
Here's my table, it's called NR_PVO_120:
OtherID Fax
12365092 2762364204
12005656 2762364204
12484936 2762364204
39003042 2762364204
12365597 2762364204
12635922 2762364204
12332346 2762364204
12365092 4387267572
12005656 4387267572
12365092 4422911281
12005656 4422911281
12484936 4422911281
12651239 4422911281
12388710 4422911281
12686953 4422911281
12365092 4423311213
12005656 4423311213
12709544 4423311213
12484936 4423311213
12005656 4424450542
12346839 4424450542
12365120 4424450542
12484936 4424450542
12086512 4424450542
Based on this table I run the following query, whose result is used in the function from the linked question:
SELECT Fax
,COUNT(OtherID) CountOfPracs
FROM NR_PVO_120
GROUP BY Fax
So the counts look like this:
Fax CountOfPracs
4422911281 6
4387267572 2
4423311213 4
4424450542 5
2762364204 7
If you add up all the counts you get 24 people, but in reality there are only 14, because one person can have multiple faxes.
Is there a way to count, for the second fax, only those people that weren't counted in the first one, then for the 3rd fax only those that weren't counted in the first two, and so on?
so the results would be
2762364204 7
4387267572 0
4422911281 3
4423311213 1
4424450542 3
the first fax has 7 people
the second fax has 2 but both of those were already counted in the first fax so no new unique people were added
the third fax has 6 but only 3 of those haven't already been counted
the fourth fax has 4 but only 1 hasn't been counted earlier
the fifth fax has 5 but only 3 weren't counted earlier
I know this isn't the way to create counts, they're not correct data-wise, but that's OK. I'm just trying to get all the fax numbers that have a certain number of people. Let's say I need 10 people: I have to pick out those 10 but at the same time make sure that all the fax numbers stay together. If you look at my NR_PVO_120 table and look at the first 10 people, you see that the 9th starts another fax number that spreads until 11. I won't take that fax. I either find a fax with 1 person attached or, if there are none, I stop at 9. The point is to get 10 people but make sure all people with the same fax are grouped together.
Or is there some other way to only count the UNIQUE providers (which should equal 14)?
I made the table for testing:
create table nr_pvo_120 (
otherid,
fax
)
as
select 12365092 , 2762364204 from dual union all
select 12005656 , 2762364204 from dual union all
select 12484936 , 2762364204 from dual union all
select 39003042 , 2762364204 from dual union all
select 12365597 , 2762364204 from dual union all
select 12635922 , 2762364204 from dual union all
select 12332346 , 2762364204 from dual union all
select 12365092 , 4387267572 from dual union all
select 12005656 , 4387267572 from dual union all
select 12365092 , 4422911281 from dual union all
select 12005656 , 4422911281 from dual union all
select 12484936 , 4422911281 from dual union all
select 12651239 , 4422911281 from dual union all
select 12388710 , 4422911281 from dual union all
select 12686953 , 4422911281 from dual union all
select 12365092 , 4423311213 from dual union all
select 12005656 , 4423311213 from dual union all
select 12709544 , 4423311213 from dual union all
select 12484936 , 4423311213 from dual union all
select 12005656 , 4424450542 from dual union all
select 12346839 , 4424450542 from dual union all
select 12365120 , 4424450542 from dual union all
select 12484936 , 4424450542 from dual union all
select 12086512 , 4424450542 from dual
/
My first shot would be: for each person (otherid) get only their first (lowest) fax number, and then do a normal group by and count on that:
select first_fax, count(*) firstcount
from (
   select otherid, min(fax) first_fax
   from nr_pvo_120
   group by otherid
)
group by first_fax
order by first_fax
/
The output will become:
FIRST_FAX FIRSTCOUNT
---------- ----------
2762364204 7
4422911281 3
4423311213 1
4424450542 3
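As a quick cross-check, the FIRSTCOUNT values add up to 7 + 3 + 1 + 3 = 14, which is the number of unique people the question asks about. If only that total is needed, a minimal sketch (assuming the same nr_pvo_120 test table; the alias unique_people is just an illustrative name) would be a plain distinct count:

```sql
-- Count each person (otherid) once, no matter how many faxes they appear on.
-- unique_people is an illustrative alias, not a name from the question.
select count(distinct otherid) unique_people
from nr_pvo_120
/
```

On the test data above this should return 14.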
Then I noticed your desired output includes all five fax numbers, with a count of zero for the one that adds no new people. That can for example be done like this:
select fax, count(*) normalcount, count(otherid_on_first_fax) countunused
from (
   select fax, otherid,
          case
             when fax = min(fax) over (partition by otherid)
             then otherid
          end otherid_on_first_fax
   from nr_pvo_120
)
group by fax
order by fax
/
In this output, column NORMALCOUNT is the number of people having that fax. Column COUNTUNUSED is the number of people that haven't already been "used" in the previous counts:
FAX NORMALCOUNT COUNTUNUSED
---------- ----------- -----------
2762364204 7 7
4387267572 2 0
4422911281 6 3
4423311213 4 1
4424450542 5 3
The trick is that otherid_on_first_fax only has the value of otherid on the person's first fax number; for the rest of the person's fax numbers otherid_on_first_fax is NULL. count(otherid_on_first_fax) then counts all the non-null values, of which there are none for fax 4387267572.
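Since the stated goal is to pick whole faxes until a target number of people (say 10) is reached, one possible extension is to add a running total of the new people per fax. This is only a sketch: it assumes faxes are taken in ascending fax order, as in the queries above, and the column alias running_people is made up for illustration.

```sql
-- Same inner query as above; the analytic sum is evaluated after the group by,
-- so it accumulates the per-fax count of not-yet-counted people in fax order.
select fax,
       count(*) normalcount,
       count(otherid_on_first_fax) countunused,
       sum(count(otherid_on_first_fax)) over (order by fax) running_people
from (
   select fax, otherid,
          case
             when fax = min(fax) over (partition by otherid)
             then otherid
          end otherid_on_first_fax
   from nr_pvo_120
)
group by fax
order by fax
/
```

With the test data the running total should go 7, 7, 10, 11, 14, so taking the first three faxes gives exactly 10 distinct people without splitting any fax.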