Problem description
I want to delete every group in which none of the observations has NUM=14.
For example, given this original data:
ID NUM
1 14
1 12
1 10
2 13
2 11
2 10
3 14
3 10
Since none of the observations with ID=2 has NUM=14, group 2 is deleted. The result should look like this:
ID NUM
1 14
1 12
1 10
3 14
3 10
This is what I have so far, but it doesn't seem to work.
data originaldat;
set newdat;
by ID;
If first.ID then do;
IF NUM EQ 14 then Score = 100;
Else Score = 10;
end;
else SCORE+1;
run;
data newdat;
set newdat;
If score LT 50 then delete;
run;
Recommended answer
One way to do this with proc sql is:
proc sql;
create table newdat as
select *
from originaldat
where ID in (
select ID
from originaldat
where NUM = 14
);
quit;
The subquery selects the IDs of the groups that contain an observation where NUM = 14. The where clause then limits the selected data to only these groups.
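To try the query on the sample data from the question, originaldat has to exist first. A minimal sketch, assuming the data is typed in with datalines (the question does not say how the data set is actually loaded):

```sas
/* Build the sample data from the question (assumed to be entered inline) */
data originaldat;
    input ID NUM;
    datalines;
1 14
1 12
1 10
2 13
2 11
2 10
3 14
3 10
;
run;

/* After running the proc sql step above, list the rows that were kept */
proc print data = newdat noobs;
run;
```

With this input, only the ID=1 and ID=3 groups should remain in newdat, matching the expected output shown in the question.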
An equivalent data step approach is:
/* Get all the groups that contain an observation where NUM = 14 */
data keepGroups;
set originaldat;
if NUM = 14;
keep ID;
run;
/* Sort both data sets to ensure the data step merge works as expected */
proc sort data = originaldat;
by ID;
run;
/* Make sure there are no duplicate values in the groups to be kept */
proc sort data = keepGroups nodupkey;
by ID;
run;
/*
Merge the original data with the groups to keep and only keep records
where an observation exists in the groups to keep dataset
*/
data newdat;
merge
originaldat
keepGroups (in = k);
by ID;
if k;
run;
In both data steps, the subsetting if statement is used to output observations only when the condition is met. In the second case, k is a temporary variable with value 1 (true) when a value is read from keepGroups and 0 (false) otherwise.
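The in= flag can also be used the other way round. As a small sketch (the data set name droppedGroups and the flag name o are hypothetical and not part of the original answer), negating the condition keeps the groups that would otherwise be deleted, which can help when checking the result. It assumes originaldat and keepGroups are already sorted by ID, as in the steps above:

```sas
/* Hypothetical check: list the groups that have no NUM = 14 observation */
data droppedGroups;
    merge
        originaldat (in = o)
        keepGroups (in = k);
    by ID;
    /* k is 0 for every ID that never appears in keepGroups */
    if o and not k;
run;
```

On the sample data from the question this should contain only the three ID=2 observations.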