Running multiple simple linear regressions from a data frame in R


Problem description

I have a dataset (data frame) with 5 columns, all containing numeric values.

I'm looking to run a simple linear regression for each pair of columns in the dataset.

For example, if the columns were named A, B, C, D, E, I want to run lm(A~B), lm(A~C), lm(A~D), ..., lm(D~E), and then plot the data for each pair along with the regression line.

I'm pretty new to R, so I'm spinning my wheels on how to actually accomplish this. Should I use ddply? Or lapply? I'm not really sure how to tackle this.

Recommended answer

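Here is an approach using combn. Passing a two-column data frame to lm() uses the first column as the response and the second as the predictor: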

 combn(names(DF), 2, function(x){lm(DF[, x])}, simplify = FALSE)

Example:

set.seed(1)
DF <- data.frame(A=rnorm(50, 100, 3),
                 B=rnorm(50, 100, 3),
                 C=rnorm(50, 100, 3),
                 D=rnorm(50, 100, 3),
                 E=rnorm(50, 100, 3))
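
As a rough sketch of the same idea with lapply (which the question asks about), the pairs can also be fitted with explicit formulas so that each model records which variable is the response. The names var_pairs and fits, and the "A ~ B" style labels, are illustrative choices and not part of the original answer.

```r
# Same example data as above, repeated so the sketch runs on its own.
set.seed(1)
DF <- data.frame(A = rnorm(50, 100, 3),
                 B = rnorm(50, 100, 3),
                 C = rnorm(50, 100, 3),
                 D = rnorm(50, 100, 3),
                 E = rnorm(50, 100, 3))

# Each column of var_pairs holds one (response, predictor) pair of column names.
var_pairs <- combn(names(DF), 2)

# Fit lm(response ~ predictor) for every pair; reformulate() builds the formula.
fits <- lapply(seq_len(ncol(var_pairs)), function(i) {
  lm(reformulate(var_pairs[2, i], response = var_pairs[1, i]), data = DF)
})

# Label each model by its formula (hypothetical naming scheme).
names(fits) <- paste(var_pairs[1, ], var_pairs[2, ], sep = " ~ ")

length(fits)           # one model per pair of columns
coef(fits[["A ~ B"]])  # intercept and slope for A regressed on B
```

Keeping the full lm objects, not only the coefficients, means summary(), residuals() and predict() remain available for each pair.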

Update: adding @Henrik's suggestion (see comments)

# only the coefficients
> results <- combn(names(DF), 2, function(x){coefficients(lm(DF[, x]))}, simplify = FALSE)
> vars <- combn(names(DF), 2)
> names(results) <- vars[1, ] # name each result by its response variable; the predictor appears in the coefficient name
> results
$A
 (Intercept)            B
103.66739418  -0.03354243

$A
(Intercept)           C
97.88341555  0.02429041

$A
(Intercept)           D
122.7606103  -0.2240759

$A
(Intercept)           E
99.26387487  0.01038445

$B
 (Intercept)            C
99.971253525  0.003824755

$B
 (Intercept)            D
102.65399702  -0.02296721

$B
(Intercept)           E
96.83042199  0.03524868

$C
(Intercept)           D
 80.1872211   0.1931079

$C
(Intercept)           E
 89.0503893   0.1050202

$D
 (Intercept)            E
107.84384655  -0.07620397
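
The question also asks for a plot of each pair with its regression line, which the answer above does not show. One possible sketch, using base graphics and the same example data, is given below. The 3 x 4 panel layout, the margins and the line colour are arbitrary assumptions.

```r
# Same example data as above, repeated so the sketch runs on its own.
set.seed(1)
DF <- data.frame(A = rnorm(50, 100, 3),
                 B = rnorm(50, 100, 3),
                 C = rnorm(50, 100, 3),
                 D = rnorm(50, 100, 3),
                 E = rnorm(50, 100, 3))

# Each column of var_pairs holds one (response, predictor) pair of column names.
var_pairs <- combn(names(DF), 2)

# Arrange the 10 scatterplots on a 3 x 4 grid and remember the old settings.
op <- par(mfrow = c(3, 4), mar = c(4, 4, 2, 1))

for (i in seq_len(ncol(var_pairs))) {
  y <- var_pairs[1, i]  # response: first name of the pair
  x <- var_pairs[2, i]  # predictor: second name of the pair
  fit <- lm(reformulate(x, response = y), data = DF)
  plot(DF[[x]], DF[[y]], xlab = x, ylab = y, main = paste(y, "~", x))
  abline(fit, col = "red")  # fitted regression line for this pair
}

par(op)  # restore the previous graphics settings
```

Each panel places the predictor on the x axis and the response on the y axis, matching the direction of the fitted models.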

