R: How to perform more complex calculations from a combn of a dataset?

Question

Right now, I have a combn of the column names of the built-in iris dataset. So far, I have been guided to the point where I can extract the slope coefficient from lm() for each pair of variables.

myPairs <- combn(names(iris[1:4]), 2)

formula <- apply(myPairs, MARGIN=2, FUN=paste, collapse="~")

model <- lapply(formula, function(x) lm(formula=x, data=iris)$coefficients[2])

model

However, I would like to go a few steps further and use the coefficient from lm() in further calculations. I would like to do something like this:

coefficient <- lm(formula=x, data=iris)$coefficients[2]
Spread <- myPairs[1] - coefficient*myPairs[2]
library(tseries)
adf.test(Spread)

The procedure itself is simple enough, but I haven't been able to find a way to do this for each combination in the data set. (As a side note, adf.test would not normally be applied to data like this; I'm only using the iris dataset for demonstration.) Would it be better to write a loop for such a procedure?

Recommended answer

You can do all of this within combn.

If you just want to run the regression over all combinations and extract the second coefficient (the slope), you could do

fun <- function(x) coef(lm(paste(x, collapse="~"), data=iris))[2]
combn(names(iris[1:4]), 2, fun)
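
The vector returned above does not say which pair each slope belongs to. As a minimal sketch, assuming you want the slopes labelled by their formula, the same combn call can be reused to build the names. The helper name `slope` and the `response~predictor` naming scheme are illustrative choices, not part of the original answer.

```r
# Column names of the four numeric iris measurements
vars <- names(iris)[1:4]

# Slope of response ~ predictor for one pair of column names
slope <- function(x) unname(coef(lm(paste(x, collapse = "~"), data = iris))[2])

# One slope per pair, returned as a plain numeric vector
slopes <- combn(vars, 2, slope)

# Label each slope with the formula it came from (illustrative naming scheme)
names(slopes) <- combn(vars, 2, paste, collapse = "~")
slopes
```

Because combn visits the pairs in the same order on both calls, the names line up with the values.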

You can then extend the function to calculate the spread and run the test on it

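library(tseries)  # provides adf.test()

fun <- function(x) {
  # slope of x[1] ~ x[2]
  est <- coef(lm(paste(x, collapse="~"), data=iris))[2]
  # spread between the two columns, scaled by the slope
  spread <- iris[,x[1]] - est*iris[,x[2]]
  adf.test(spread)
}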

out <- combn(names(iris[1:4]), 2, fun, simplify=FALSE)
out[[1]]

#   Augmented Dickey-Fuller Test

#data:  spread
#Dickey-Fuller = -3.879, Lag order = 5, p-value = 0.01707
#alternative hypothesis: stationary
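
Printing each element of the list one at a time gets tedious with more pairs. The sketch below, which assumes the tseries package is installed, collects the test statistic and p-value of every pair into one data frame. The names `adf_for_pair` and `results` and the column names are illustrative; the `statistic` and `p.value` fields are the standard components of the "htest" object that adf.test returns.

```r
library(tseries)  # provides adf.test()

vars <- names(iris)[1:4]

# Same steps as in the answer: slope, spread, then ADF test on the spread
adf_for_pair <- function(x) {
  est <- coef(lm(paste(x, collapse = "~"), data = iris))[2]
  spread <- iris[, x[1]] - est * iris[, x[2]]
  adf.test(spread)
}

out <- combn(vars, 2, adf_for_pair, simplify = FALSE)

# One row per pair; the column names here are illustrative choices
results <- data.frame(
  pair      = combn(vars, 2, paste, collapse = "~"),
  statistic = sapply(out, function(h) unname(h$statistic)),
  p.value   = sapply(out, function(h) h$p.value)
)
results
```

adf.test may warn when a p-value falls outside the range of its lookup table; the reported value is then the boundary of that range.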

Compare the result with running the first pair manually

est <- coef(lm(Sepal.Length ~ Sepal.Width, data=iris))[2]
spread <- iris[,"Sepal.Length"] - est*iris[,"Sepal.Width"]
adf.test(spread)

#   Augmented Dickey-Fuller Test

# data:  spread
# Dickey-Fuller = -3.879, Lag order = 5, p-value = 0.01707
# alternative hypothesis: stationary
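
On the question of whether a loop would be better: an explicit loop gives the same spreads and is mostly a matter of taste. A rough base-R sketch follows, with the adf.test call left as a comment so the block runs without tseries. The variable names and the use of reformulate() to build the formula are choices made for this example, not part of the original answer.

```r
vars  <- names(iris)[1:4]
pairs <- combn(vars, 2)  # 2 x 6 character matrix, one pair per column

spreads <- vector("list", ncol(pairs))
for (i in seq_len(ncol(pairs))) {
  y <- pairs[1, i]  # response column name
  x <- pairs[2, i]  # predictor column name

  # Build y ~ x as a formula object and take the slope
  est <- coef(lm(reformulate(x, response = y), data = iris))[2]

  # Spread between the two columns, scaled by the slope
  spreads[[i]] <- iris[[y]] - est * iris[[x]]

  # tseries::adf.test(spreads[[i]]) could be called here
}
names(spreads) <- paste(pairs[1, ], pairs[2, ], sep = "~")
```

The combn version is shorter because it generates the pairs and applies the function in one call; the loop makes each intermediate value easier to inspect.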
