Problem description
I'm using R, and I have two data.frames, A and B. They both have 6 rows, but A has 25000 columns (genes), and B has 30 columns. I'd like to apply a function with two arguments, f(x,y), where x is every column of A and y is every column of B. So far it looks like this:
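# preallocate the result; this assumes f() returns a single number
out <- matrix(NA_real_, nrow = ncol(A), ncol = ncol(B))
i = 1
for (x in A){
  j = 1
  for (y in B){
    out[i,j] <- f(x,y)
    j = j + 1
  }
  i = i + 1
}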
I have two issues with this: from my Python programming I regard keeping track of counters like this as crufty, and from my R programming I am nervous of for loops. However, I can't quite see how to apply apply (or even whether I should apply apply) to this problem, and was hoping someone might enlighten me. I need to treat f() as atomic (it's actually cor.test()) for now.
Recommended answer
Since you are using data frames, it might be faster to use lapply or sapply to do this (especially given the size of your data frames). For example,
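x <- data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8), col3=c(9,10,11,12))
y <- data.frame(col1=c(1,2,3,4), col2=c(5,6,7,8))
f <- function(u, v) cor(u, v)  # placeholder; substitute your own two-argument function
bl <- lapply(x, function(u){
  lapply(y, function(v){
    f(u,v) # Function with column from x and column from y as inputs
  })
})
# one row per column of x, one column per column of y
out <- matrix(unlist(bl), ncol=ncol(y), byrow=TRUE)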
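For the cor.test() case mentioned in the question, the same idea can be sketched with nested sapply calls, which return a named matrix directly and need no counters. This is only an illustration under assumptions: the small random A and B below stand in for the real data, pval is a hypothetical helper, and it assumes the p-value is the single number wanted from each test.

```r
set.seed(1)

# Small stand-ins for the real data: 6 rows, 5 "genes" and 3 other variables
A <- as.data.frame(matrix(rnorm(6 * 5), nrow = 6))
B <- as.data.frame(matrix(rnorm(6 * 3), nrow = 6))
names(A) <- paste0("gene", 1:5)
names(B) <- paste0("trait", 1:3)

# Hypothetical helper: reduce the cor.test() result to one number
pval <- function(u, v) cor.test(u, v)$p.value

# The inner sapply runs over the columns of A, the outer over the columns of B,
# so the result has one row per column of A and one column per column of B
out <- sapply(B, function(v) sapply(A, function(u) pval(u, v)))

dim(out)       # rows follow A, columns follow B
rownames(out)  # column names of A
colnames(out)  # column names of B
```

If only the correlation coefficients are needed rather than the test results, cor(A, B) computes the whole matrix of pairwise column correlations in a single call.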