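I have five data frames of about 60 columns each that I need to combine. They all have the same columns, and I am combining them by taking their mean, since they represent the same values. The problem is not being able to combine them, but doing so efficiently. Here is some example data/code: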

#reproducible random data
set.seed(123)

dat1 <- data.frame( a = rnorm(16), b = rnorm(16), c = rnorm(16), d = rnorm(16), e = rnorm(16), f = rnorm(16))
dat2 <- data.frame( a = rnorm(16), b = rnorm(16), c = rnorm(16), d = rnorm(16), e = rnorm(16), f = rnorm(16))
dat3 <- data.frame( a = rnorm(16), b = rnorm(16), c = rnorm(16), d = rnorm(16), e = rnorm(16), f = rnorm(16))

#This works but is inefficient

final_data<-data.frame(a=rowMeans(cbind(dat1$a,dat2$a,dat3$a)),
                       b=rowMeans(cbind(dat1$b,dat2$b,dat3$b)),
                       c=rowMeans(cbind(dat1$c,dat2$c,dat3$c)),
                       d=rowMeans(cbind(dat1$d,dat2$d,dat3$d)),
                       e=rowMeans(cbind(dat1$e,dat2$e,dat3$e)),
                       f=rowMeans(cbind(dat1$f,dat2$f,dat3$f))
)
#what results should look like
head(final_data)
#             a           b          c           d            e           f
# 1 0.573813625  0.17695841 -0.1434628 -0.53673101  0.353906578  0.24262067
# 2 0.135689926 -0.69206908  0.2888584 -0.37215810 -0.038298083 -0.23317107
# 3 0.004068807  0.44666945  0.5205118  0.09587453 -0.308528454  0.30516883
# 4 0.347100292  0.02401646  0.1409754 -0.15931120  0.587047386 -0.08684867
# 5 0.006529998  0.09010946  0.4932670  0.62606230 -0.005235813 -0.36967000
# 6 0.240225778 -0.45824825 -0.5000004  0.66131121  0.619480608  0.55650611

The problem here is that I don't want to rewrite a=rowMeans(cbind(dat1$a,dat2$a,dat3$a)) for each of the 60 columns in the new data frame. Can you think of a good way to go about this?

EDIT: I am going to accept the following answer because it lets me set which columns to apply it to:
final_data1<-as.data.frame(sapply(colnames(dat1),function(i)
    rowMeans(cbind(dat1[,i],dat2[,i],dat3[,i]))))

> identical(final_data1,final_data)
[1] TRUE
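
The column selection mentioned in the edit could look roughly like the sketch below. The names `make_df`, `dfs`, `cols` and `subset_means` are illustrative and not from the original post, and the data is regenerated with fewer columns so the block runs on its own. It also assumes the data frames are kept in a list, which makes the same code work for five data frames as well as three.

```r
# Sketch: average only selected columns across a list of data frames.
set.seed(123)

# Hypothetical helper that builds one small example data frame
make_df <- function() data.frame(a = rnorm(16), b = rnorm(16), c = rnorm(16))
dfs <- list(make_df(), make_df(), make_df())  # any number of data frames

# Columns to average; assumed to exist in every data frame
cols <- c("a", "c")

subset_means <- as.data.frame(sapply(cols, function(i)
  # collect column i from every data frame into a matrix, then average each row
  rowMeans(sapply(dfs, function(d) d[[i]]))))

head(subset_means)
```

Passing `colnames(dfs[[1]])` as `cols` averages every column, as in the accepted answer.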

Best answer

Try this:

sapply(colnames(dat1),function(i)
  rowMeans(cbind(dat1[,i],dat2[,i],dat3[,i])))
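
Note that `sapply` returns a matrix here, which is why the question wraps the call in `as.data.frame`. As an alternative sketch that is not part of the original answer: since the data frames share the same shape and column order, they can be added element-wise with `Reduce` and divided by their count. This assumes every column is numeric and that there are no `NA` values that would need skipping. The names `dfs`, `mean_df` and `by_column` are illustrative, and the example data is cut down to two columns.

```r
# Sketch: element-wise mean of several same-shaped, all-numeric data frames.
set.seed(123)
dat1 <- data.frame(a = rnorm(16), b = rnorm(16))
dat2 <- data.frame(a = rnorm(16), b = rnorm(16))
dat3 <- data.frame(a = rnorm(16), b = rnorm(16))

dfs <- list(dat1, dat2, dat3)

# Sum the data frames cell by cell, then divide by how many there are
mean_df <- Reduce(`+`, dfs) / length(dfs)

# Column-by-column version from the answer, for comparison
by_column <- as.data.frame(sapply(colnames(dat1), function(i)
  rowMeans(cbind(dat1[, i], dat2[, i], dat3[, i]))))

# Compare with a numeric tolerance; the two approaches accumulate sums differently
all.equal(mean_df, by_column)
```

The `Reduce` version keeps the result as a data frame without any conversion, but it gives no per-column control and it matches columns by position rather than by name.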

Regarding "r - new data frame from means across data frames", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/29905250/
