Problem description
The dataset looks like this:
colx coly colz
0 1 0
0 1 1
0 1 0
Desired output:
Colname value count
colx 0 3
coly 1 3
colz 0 2
colz 1 1
The following code works perfectly...
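ods output onewayfreqs=outfreq;
proc freq data=final;
  tables colx coly colz / nocum nofreq;
run;

data freq;
  retain colname column_value;
  set outfreq;
  /* the OneWayFreqs output stores "Table <variable>" in the Table column */
  colname = scan(table, 2, ' ');
  /* fetch the formatted value of the variable named in colname */
  column_value = trim(left(vvaluex(colname)));
  keep colname column_value frequency percent;
run;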
... but I believe that's not efficient. Say I have 1000 columns; running proc freq on all 1000 columns is not efficient. Is there a more efficient way to produce my desired output without using proc freq?
Recommended answer
One of the most efficient mechanisms for computing frequency counts is a hash object set up for reference counting via the suminc tag.
The SAS documentation topic "Hash Object - Maintaining Key Summaries" demonstrates the technique for a single variable. The following example goes one step further and computes counts for each variable specified in an array. The suminc:'one' argument specifies that each call to ref adds the value of one to an internal reference sum. While iterating over the distinct keys for output, the frequency count is extracted via the sum method.
* one million data values;
data have;
  array v(1000);
  do row = 1 to 1000;
    do index = 1 to dim(v);
      v(index) = ceil(100*ranuni(123));
    end;
    output;
  end;
  keep v:;
  format v: 4.;
run;
* compute frequency counts via .ref();
data freak_out(keep=name value count);
  length name $32 value 8;

  * one hash entry per distinct (name, value) pair;
  declare hash bins(ordered:'a', suminc:'one');
  bins.defineKey('name', 'value');
  bins.defineData('name', 'value');
  bins.defineDone();

  * amount added to the key summary on each ref();
  one = 1;

  do until (end_of_data);
    set have end=end_of_data;
    array v v1-v1000;
    do index = 1 to dim(v);
      name = vname(v(index));
      value = v(index);
      bins.ref();
    end;
  end;

  * walk the distinct keys and retrieve each accumulated count;
  declare hiter out('bins');
  do while (out.next() = 0);
    bins.sum(sum:count);
    output;
  end;
run;
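
As a small usage sketch, the same technique can be applied to the three-row dataset from the question. The input dataset name final and the columns colx, coly, colz come from the question; the output name want, the end flag done, and the loop index i are hypothetical names chosen for illustration. The explicit stop is an addition that keeps the step from looping back for a second pass.

```sas
* rebuild the sample data from the question;
data final;
  input colx coly colz;
  datalines;
0 1 0
0 1 1
0 1 0
;
run;

* count each (column, value) pair with a reference-counting hash;
data want(keep=colname value count);
  length colname $32 value 8;

  declare hash bins(ordered:'a', suminc:'one');
  bins.defineKey('colname', 'value');
  bins.defineData('colname', 'value');
  bins.defineDone();

  one = 1;

  do until (done);
    set final end=done;
    array cols colx coly colz;
    do i = 1 to dim(cols);
      colname = vname(cols(i));
      value = cols(i);
      bins.ref();
    end;
  end;

  * emit one row per distinct key with its accumulated count;
  declare hiter out('bins');
  do while (out.next() = 0);
    bins.sum(sum:count);
    output;
  end;
  stop;
run;
```

Because the hash is ordered ascending on colname and then value, the rows should come out in the same order as the desired output in the question.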
Note that Proc FREQ uses standard syntax, accepts variables that are a mix of character and numeric, and offers many additional features that are specified through options.
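
One way the hash approach might be extended to a mix of character and numeric variables is to key on the formatted value, obtained with vvalue, rather than the raw value. The following is an untested sketch under several assumptions: the output name want_mixed and index i are hypothetical, the input dataset have contains at least one variable of each type (otherwise SAS may warn about a zero-element array), no input variable is named name, value, count, one, or i, and formatted values fit in 200 characters.

```sas
data want_mixed(keep=name value count);
  * define the input variables first so the arrays pick up only those;
  if 0 then set have;
  array nums _numeric_;
  array chars _character_;

  * value is character here because it holds formatted values;
  length name $32 value $200;

  declare hash bins(ordered:'a', suminc:'one');
  bins.defineKey('name', 'value');
  bins.defineData('name', 'value');
  bins.defineDone();

  one = 1;

  do until (end_of_data);
    set have end=end_of_data;
    do i = 1 to dim(nums);
      name = vname(nums(i));
      value = left(vvalue(nums(i)));
      bins.ref();
    end;
    do i = 1 to dim(chars);
      name = vname(chars(i));
      value = vvalue(chars(i));
      bins.ref();
    end;
  end;

  declare hiter out('bins');
  do while (out.next() = 0);
    bins.sum(sum:count);
    output;
  end;
  stop;
run;
```

A trade-off of this design is that numeric values are sorted as text in the output, so 10 sorts before 9.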