Problem description
Right now, I have the output of combn() applied to the built-in iris dataset. So far, I have been guided to the point where I can find the slope coefficient from lm() for each pair of variables.
myPairs <- combn(names(iris[1:4]), 2)
formula <- apply(myPairs, MARGIN=2, FUN=paste, collapse="~")
model <- lapply(formula, function(x) lm(formula=x, data=iris)$coefficients[2])
model
However, I would like to go a few steps further and use the coefficient from lm() in further calculations. I would like to do something like this:
coefficient <- lm(formula=x, data=iris)$coefficients[2]
Spread <- myPairs[1] - coefficient*myPairs[2]
library(tseries)
adf.test(Spread)
The procedure itself is simple enough, but I haven't been able to find a way to do this for each combination in the data set. (As a side note, adf.test would not normally be applied to data like this; I'm only using the iris dataset for demonstration.) Would it be better to write a loop for such a procedure?
Recommended answer
You can do all of this within combn().
If you just want to run the regression over all combinations and extract the second coefficient (the slope), you can pass a function to combn():
fun <- function(x) coef(lm(paste(x, collapse="~"), data=iris))[2]
combn(names(iris[1:4]), 2, fun)
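The vector returned above does not say which slope belongs to which pair. A minimal sketch of one way to label it, assuming base R only (the names `vars`, `slope` and `slopes` are illustrative and not part of the original answer):

```r
# Names of the four numeric columns in iris
vars <- names(iris)[1:4]

# Slope of lm(first ~ second) for one pair of column names
slope <- function(x) {
  f <- as.formula(paste(x, collapse = " ~ "))
  coef(lm(f, data = iris))[[2]]
}

# One slope per pair; as.vector() drops any dim attribute combn() may attach
slopes <- as.vector(combn(vars, 2, slope))

# Label each slope with the formula it came from
# (extra arguments such as collapse are passed through combn() to paste())
names(slopes) <- as.vector(combn(vars, 2, paste, collapse = " ~ "))
slopes
```

Because both calls to combn() enumerate the pairs in the same order, the labels line up with the slopes.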
You can then extend the function to calculate the spread and run adf.test() on it:
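library(tseries)  # provides adf.test()
fun <- function(x) {
  # slope from regressing the first variable of the pair on the second
  est <- coef(lm(paste(x, collapse="~"), data=iris))[2]
  # spread between the two columns, scaled by the slope
  spread <- iris[,x[1]] - est*iris[,x[2]]
  adf.test(spread)
}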
out <- combn(names(iris[1:4]), 2, fun, simplify=FALSE)
out[[1]]
# Augmented Dickey-Fuller Test
# data: spread
# Dickey-Fuller = -3.879, Lag order = 5, p-value = 0.01707
# alternative hypothesis: stationary
Compare this with the result of running the first combination manually:
est <- coef(lm(Sepal.Length ~ Sepal.Width, data=iris))[2]
spread <- iris[,"Sepal.Length"] - est*iris[,"Sepal.Width"]
adf.test(spread)
# Augmented Dickey-Fuller Test
# data: spread
# Dickey-Fuller = -3.879, Lag order = 5, p-value = 0.01707
# alternative hypothesis: stationary
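On the question of whether a loop would be better: an explicit loop over the columns of the combn() matrix gives the same results and is mostly a matter of taste. The sketch below is one possible version, assuming the tseries package is installed; the data frame `res` and its column names are illustrative choices, not part of the original answer.

```r
library(tseries)  # provides adf.test(); assumed to be installed

vars  <- names(iris)[1:4]
pairs <- combn(vars, 2)   # 2 x 6 character matrix, one pair per column

# One row per pair, filled in by the loop below
res <- data.frame(
  y         = pairs[1, ],
  x         = pairs[2, ],
  slope     = NA_real_,
  statistic = NA_real_,
  p.value   = NA_real_
)

for (i in seq_len(ncol(pairs))) {
  y <- iris[[pairs[1, i]]]
  x <- iris[[pairs[2, i]]]
  est    <- coef(lm(y ~ x))[[2]]   # slope of y regressed on x
  spread <- y - est * x
  test   <- adf.test(spread)       # may warn when the p-value falls outside its table
  res$slope[i]     <- est
  res$statistic[i] <- unname(test$statistic)
  res$p.value[i]   <- test$p.value
}
res
```

Collecting the slope, test statistic and p-value in one data frame makes it easier to compare all pairs at once than printing each element of a list.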