Problem description
I have this code in R:
corr = function(x, y) {
  sx = sign(x)
  sy = sign(y)
  cond_a = sx == sy && sx > 0 && sy > 0
  cond_b = sx < sy && sx < 0 && sy > 0
  cond_c = sx > sy && sx > 0 && sy < 0
  cond_d = sx == sy && sx < 0 && sy < 0
  cond_e = sx == 0 || sy == 0
  if (cond_a) return('a')
  else if (cond_b) return('b')
  else if (cond_c) return('c')
  else if (cond_d) return('d')
  else if (cond_e) return('e')
}
It is meant to be used together with the mapply function in R to count all the possible sign patterns present in a time series. In this case the pattern has a length of 2, and the possible tuples are: (+,+), (+,-), (-,+), (-,-).
I use the corr function this way:
> with(dt['AAPL'], table(mapply(corr, Return[-1], Return[-length(Return)])) /length(Return)*100)
a b c d e
24.6129416 25.4466058 25.4863041 24.0174672 0.3969829
> dt["AAPL",list(date, Return)]
symbol date Return
1: AAPL 2014-08-29 -0.3499903
2: AAPL 2014-08-28 0.6496702
3: AAPL 2014-08-27 1.0987923
4: AAPL 2014-08-26 -0.5235654
5: AAPL 2014-08-25 -0.2456037
I would like to generalize the corr function to n arguments. This means that for every n I would have to write down all the conditions corresponding to all the possible n-tuples. Currently the best thing I can think of is a Python script that writes the code string using loops, but there must be a way to do this properly. Do you have an idea about how I could generalize the tedious condition writing? Maybe I could try to use expand.grid, but how would I do the matching then?
Recommended answer
I think you're better off using rollapply(...) in the zoo package for this. Since you seem to be using quantmod anyway (which loads xts and zoo), here is a solution that does not use all those nested if(...) statements.
library(quantmod)
AAPL    <- getSymbols("AAPL", auto.assign=FALSE)
AAPL    <- AAPL["2007-08::2009-03"]   # AAPL during the crash...
Returns <- dailyReturn(AAPL)

get.patterns <- function(ret, n) {
  f <- function(x) {   # identifies which row of `patterns` matches sign(x)
    which(apply(patterns, 1, function(row) all(row == sign(x))))
  }
  returns  <- na.omit(ret)
  patterns <- expand.grid(rep(list(c(-1, 1)), n))
  labels   <- apply(patterns, 1, function(row) paste0("(", paste(row, collapse=","), ")"))
  result   <- rollapply(returns, width=n, f, align="left")
  data.frame(100 * table(labels[result]) / (length(returns) - (n - 1)))
}
get.patterns(Returns,n=2)
# Var1 Freq
# 1 (-1,-1) 22.67303
# 2 (-1,1) 26.49165
# 3 (1,-1) 26.73031
# 4 (1,1) 23.15036
get.patterns(Returns,n=3)
# Var1 Freq
# 1 (-1,-1,-1) 9.090909
# 2 (-1,-1,1) 13.397129
# 3 (-1,1,-1) 14.593301
# 4 (-1,1,1) 11.722488
# 5 (1,-1,-1) 13.636364
# 6 (1,-1,1) 13.157895
# 7 (1,1,-1) 12.200957
# 8 (1,1,1) 10.765550
The basic idea is to create a patterns matrix with 2^n rows and n columns, where each row represents one of the possible patterns (e.g., (1,1), (-1,1), etc.). Then pass the daily returns to this function n at a time using rollapply(...) and identify which row in patterns matches sign(x) exactly. Then use this vector of row numbers as an index into labels, which contains a character representation of the patterns, and finally use table(...) as you did.
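To make the matching step concrete, here is a minimal sketch on a toy vector. It slides the window by hand in base R rather than calling rollapply(...), and the data in x and the helper name match_row are made up for illustration only.

```r
# Toy illustration of the matching step (hypothetical data, not AAPL returns).
x <- c(0.5, -0.2, 0.1, 0.3, -0.4)
n <- 2

# All 2^n sign patterns, one per row; the first column varies fastest.
patterns <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))

# Row index of `patterns` whose entries equal the signs of one window.
match_row <- function(window) {
  which(apply(patterns, 1, function(row) all(row == sign(window))))
}

# Slide a window of width n over x and look up each window's pattern row.
starts <- seq_len(length(x) - n + 1)
idx    <- sapply(starts, function(i) match_row(x[i:(i + n - 1)]))

# Character labels, one per pattern row, indexed by the matched row numbers.
labels <- apply(patterns, 1, function(row) paste0("(", paste(row, collapse = ","), ")"))
counts <- table(labels[idx])
counts
```

The four windows here have signs (1,-1), (-1,1), (1,1) and (1,-1), so the pattern (1,-1) is counted twice and the other two once each.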
This is general for an n-day pattern, but it ignores windows in which any return is exactly zero, so the $Freq column does not add up to 100. As you can see, this doesn't happen very often.
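If zero returns should be counted rather than dropped, one option is to skip the lookup against patterns and build each label directly from the signs of the window. The following is only a sketch under that assumption: it uses base R's embed() in place of rollapply(...), and count_patterns and the toy vector x are hypothetical names, not part of zoo or quantmod.

```r
# Sketch: count n-day sign patterns, zeros included, using base R only.
# `count_patterns` is a hypothetical helper name.
count_patterns <- function(x, n) {
  s <- sign(x[!is.na(x)])
  # embed() returns one row per window with columns in reverse time order,
  # so flip the columns to read each window oldest to newest.
  windows <- embed(s, n)[, n:1, drop = FALSE]
  labels  <- apply(windows, 1, function(row) paste0("(", paste(row, collapse = ","), ")"))
  100 * table(labels) / nrow(windows)
}

x    <- c(0.5, -0.2, 0, 0.3, -0.4)   # toy data with one exact zero
freq <- count_patterns(x, 2)
freq
sum(freq)   # every window is counted, so the percentages cover all of them
```

Because only patterns that actually occur get a label, the result has no rows for unobserved patterns; joining against an expand.grid over c(-1, 0, 1) would be one way to list all 3^n of them.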
It's interesting that even during the crash it was (very slightly) more likely to have two up days in succession than two down days. If you look at plot(Cl(AAPL)) during this period, you can see that it was a pretty wild ride.