Question

I'm using R, and I have two data.frames, A and B. They both have 6 rows, but A has 25000 columns (genes), and B has 30 columns. I'd like to apply a function with two arguments f(x,y) where x is every column of A and y is every column of B. So far it looks like this:

i = 1
for (x in A){
    j = 1
    for (y in B){
        out[i,j] <- f(x,y)
        j = j + 1
    }
    i = i + 1
}

I have two issues with this: from my Python programming I associate keeping track of counters like this as crufty, and from my R programming I am nervous of for loops. However, I can't quite see how to apply apply (or even if I should apply apply) to this problem and was hoping someone might enlighten me. I need to treat f() as atomic (it's actually cor.test()) for now.

Recommended answer

Since you are using data frames, it might be faster to use lapply or sapply to do this (especially given the size of your data frames). For example,

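# Small example data frames; each column is passed to f as one argument
x <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 7, 8), col3 = c(9, 10, 11, 12))
y <- data.frame(col1 = c(1, 2, 3, 4), col2 = c(5, 6, 7, 8))
# Outer lapply walks the columns of x, inner lapply walks the columns of y
bl <- lapply(x, function(u) {
  lapply(y, function(v) {
    f(u, v)  # Function with column from x and column from y as inputs
  })
})
# Assumes f returns a single number: one row per column of x, one column per column of y
out <- matrix(unlist(bl), ncol = ncol(y), byrow = TRUE)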
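
Since f() in the question is actually cor.test(), which returns a list-like test object rather than a single number, unlist() on the nested result would not give a clean matrix. One possible way around this, sketched here under assumptions, is to pull out a single component such as the p-value inside the inner function. The data frames A and B below are small random stand-ins, and the column names gene1..gene4 and trait1..trait3 are made up for illustration only.

```r
# Hypothetical stand-ins for A and B (the real ones would be 6 x 25000 and 6 x 30)
set.seed(1)
A <- as.data.frame(matrix(rnorm(6 * 4), nrow = 6))
B <- as.data.frame(matrix(rnorm(6 * 3), nrow = 6))
names(A) <- paste0("gene", 1:4)    # illustrative names only
names(B) <- paste0("trait", 1:3)   # illustrative names only

# For each column of A, run cor.test against each column of B
# and keep only the p-value, so every call yields one number.
# sapply puts columns of A along the columns, so transpose at the end.
pvals <- t(sapply(A, function(u) {
  sapply(B, function(v) cor.test(u, v)$p.value)
}))

dim(pvals)       # rows follow the columns of A, columns follow the columns of B
rownames(pvals)  # names are carried over from A, so no counters are needed
```

Because sapply keeps the column names, the result can be indexed by name (for example pvals["gene2", "trait3"]) instead of by hand-maintained i and j counters. Swapping $p.value for $estimate would collect the correlation coefficients instead.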
