Problem description
I think I posted a similar question before, but this time I am struggling with the data ID.
My data looks like this:
date Stock value standard_deviation
01/01/2015 VOD 18 ...
01/01/2015 VOD 15 ...
01/01/2015 VOD 5 ...
03/01/2015 VOD 66 ...
03/01/2015 VOD 7 ...
04/01/2015 VOD 19 ...
04/01/2015 VOD 7 ...
05/01/2015 VOD 3 ...
06/01/2015 VOD 7 ...
..... ... ... ...
01/01/2015 RBS 58 ...
01/01/2015 RBS 445 ...
01/01/2015 RBS 44 ...
03/01/2015 RBS 57 ...
I need to work out the moving average and standard deviation for each stock over a window of (-3,+3) trading days.
Since those are trading days (not calendar days), and each day has a different number of trades, I created a sub-query and applied the following code.
data want;
set input;
by date;
retain gdate;
if first.date then gdate+1;
run;
proc sort data=want; by stock gdate ; run;
proc sql;
create table want1 as
select
h.stock,
h.date,
h.value,
( select std(s.value) from want s
where h.gdate between s.gdate-2 and s.gdate+2) as std
from
want h
group by stock;
quit;
I tried group by stock. However, the code ignored the stock group and only gave me the moving std over the whole period. I need the moving std for each stock separately.
Can anyone give me some ideas? Thanks!
Recommended answer
Let's get you familiarized with PROC EXPAND! It's going to be your new best friend in time series.
PROC EXPAND lets you perform basically all of the common transformations (and even some you didn't know existed).
First, to answer your question:
Step 1: Combine all values into one trading day per stock
proc sql noprint;
create table have2 as
select date, stock, sum(value) as total_value
from have
group by stock, date
order by stock, date;
quit;
Step 2: Use PROC EXPAND to compute a +/- 3 day centered moving standard deviation
proc expand data=have2
out=want;
id date;
by stock;
convert total_value = standard_deviation / transform=(cmovstd 7);
run;
Step 3: Merge back to the original table
proc sort data=have;
by stock date;
run;
data want2;
merge have
want;
by stock date;
run;
Explanation
We are exploiting by-group processing and an existing procedure to do the bulk of the work for us. Because of how the language was designed, SAS doesn't normally like to look forward, and PROC EXPAND is one of the very few procedures that can look ahead in the data without a lot of extra work. Another bonus of this procedure is that it doesn't break if there are gaps in the time series, so you can perform operations on any kind of sequential data.
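For comparison, the same by-group idea can be applied to the correlated sub-query from the question. The sketch below is only an illustration, not the method used in this answer: it assumes the trades are in a data set named input (as in the question), numbers the trading days separately within each stock, and restricts the sub-query to rows of the same stock. The data set names sorted, numbered and want_sql and the column moving_std are made up for the example. Unlike Step 1, it keeps every individual trade in the window instead of summing trades per day.

```sas
/* Sort so that BY-group processing sees each stock's dates in order */
proc sort data=input out=sorted;
  by stock date;
run;

/* Number the trading days within each stock (assumption: a "trading day"
   is any date on which that stock has at least one trade) */
data numbered;
  set sorted;
  by stock date;
  if first.stock then gdate = 0;   /* restart the counter for each stock */
  if first.date then gdate + 1;    /* sum statement: gdate is retained */
run;

proc sql;
  create table want_sql as
  select h.stock, h.date, h.value,
         /* std of all trades of the same stock within +/- 3 trading days */
         (select std(s.value)
            from numbered s
           where s.stock = h.stock
             and s.gdate between h.gdate - 3 and h.gdate + 3) as moving_std
    from numbered h
   order by stock, date;
quit;
```

The condition s.stock = h.stock is what the original query was missing. A group by clause on the outer query does not limit which rows the sub-query reads.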
One of the transformation operations, cmovstd, applies a centered moving standard deviation to the data, which is how future observations get included in the window. Note that we chose a window of 7 to get a +/- 3 day centered moving standard deviation. That is because we need:
3 past days | +3
current day | +1
3 future days | +3
| 7 = window size
Or, a total of 7 days in our window. If you wanted a +/- 2 day centered moving standard deviation, your window would be 5:
2 past days | +2
current day | +1
2 future days | +2
| 5 = window size
If you choose an even number, the window cannot be centered exactly, so it contains one more lagged day than lead day. For example, a window of 4 will yield:
2 past days | +2
current day | +1
1 future day | +1
| 4 = window size
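To see how the three window sizes above behave, a small experiment along these lines may help. It is a minimal sketch on made-up data: the data set toy, its values, and the output names std7, std5 and std4 are all hypothetical. METHOD=NONE is set so that PROC EXPAND only applies the transformations and does not interpolate first.

```sas
/* Hypothetical toy series: ten consecutive days for one stock */
data toy;
  stock = 'VOD';
  do date = '01JAN2015'd to '10JAN2015'd;
    total_value = mod(date, 7) * 3 + 1;   /* arbitrary illustrative values */
    output;
  end;
  format date ddmmyy10.;
run;

proc expand data=toy out=toy_std method=none;
  by stock;
  id date;
  convert total_value;                                  /* keep the original series */
  convert total_value = std7 / transform=(cmovstd 7);   /* +/- 3 days */
  convert total_value = std5 / transform=(cmovstd 5);   /* +/- 2 days */
  convert total_value = std4 / transform=(cmovstd 4);   /* even window */
run;

proc print data=toy_std noobs;
run;
```

Printing the three columns side by side makes it easy to check the middle rows by hand against the window diagrams.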
PROC EXPAND is like the Swiss Army knife for time series. It will interpolate, extrapolate, transform, and convert between time periods all in one step. You may find it most useful in the following situations:
1. Applying moving statistics (average, standard deviation, etc.)
proc expand data=have
out=want;
<by variable(s)>;
id <date variable>;
convert <numeric variable(s)> = <new variable name> / transform=(<operation> <window>);
run;
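As one possible way to fill in the template, the following sketch computes a +/- 3 day centered moving average per stock. It assumes the have2 data set from Step 1 (sorted by stock and date, with date stored as a SAS date value). The output names want_avg and moving_average are made up for the example.

```sas
proc expand data=have2 out=want_avg method=none;   /* transform only, no interpolation */
  by stock;                                        /* one series per stock */
  id date;
  convert total_value = moving_average / transform=(cmovave 7);
run;
```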
2. Filling in time gaps
proc expand data=have
out=want
to=<day, month, year, etc.>;
<by variable(s)>;
id date;
convert <numeric variable(s)> </method=<interpolation method> >;
run;
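A filled-in version of this template might look like the sketch below. It again assumes the have2 data set from Step 1 with date stored as a SAS date value. The output name filled is hypothetical, and METHOD=JOIN (linear interpolation between known points) is only one of the available interpolation methods, chosen here purely for illustration.

```sas
proc expand data=have2 out=filled to=day;   /* one output row per calendar day */
  by stock;
  id date;
  convert total_value / method=join;        /* linear interpolation across gaps */
run;
```

Whether interpolated values make sense for days without trades depends on the analysis. For the moving standard deviation in this answer, the gaps were deliberately left alone.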