r - Applying the same function to multiple datasets in R

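I'm new to R and I have RNAseq results for 25 samples. I want to apply the same function to each of the 25 samples to calculate the correlation of my target gene (say, gene ABC) with all the other genes.

I know how to do this for a single sample. Here is my code: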

df <- read.table("Sample1.txt", header=TRUE, sep="\t")

# gene expression values of interest
gene <- as.numeric(df["ABC",])

# correlate gene with all other genes in the expression set
correlations <- apply(df, 1, function(x){cor(gene, x)})

But now I have 25 of them. I use lapply to read them all at once.

data <- c("Sample1.txt", "Sample2.txt",..."Sample25.txt")
df <- lapply(data, read.table)
names(df) <- data

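However, I don't know how to connect this with the rest of my code above to calculate the gene correlations. I have read some related threads but still can't figure it out. Could anyone help me? Thanks!

Best answer

You should do: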

files <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")

myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")

  # gene expression values of interest
  gene <- as.numeric(df["ABC",])

  # correlate gene with all other genes in the expression set
  correlations <- apply(df, 1, function(x) cor(gene, x) )
}

lapply(files, myfunc)

That is the style I would recommend for you. This is how I would write it myself:

myfunc <- function(file) {
  df   <- read.table(file, header=TRUE, sep="\t")
  gene <- as.numeric(df["ABC",]) # gene expression values of interest
  apply(df, 1, FUN=cor, y=gene)  # correlate gene with all others
}

files <- c("Sample1.txt", "Sample2.txt", ..., "Sample25.txt")
lapply(files, myfunc)
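
Note that the `...` inside `c()` is only a placeholder for the remaining file names; R will not accept it literally. A minimal sketch of building the vector instead, assuming the files really are named Sample1.txt through Sample25.txt and sit in the working directory (the commented-out `list.files()` pattern is likewise an assumption about the naming scheme):

```r
# Assumption: the 25 files follow the pattern Sample<N>.txt.
files <- sprintf("Sample%d.txt", 1:25)

# Alternative: pick up whichever matching files actually exist on disk.
# files <- list.files(pattern = "^Sample[0-9]+\\.txt$")

head(files, 3)
```

`sprintf()` is vectorised over `1:25`, so it returns all 25 names in numeric order, whereas `list.files()` sorts alphabetically (Sample10.txt before Sample2.txt).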

You probably want to save the result in an object:

L <- lapply(files, myfunc)
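
Once the results are in a list, it can help to label each element with its file name and, if every file contains the same genes in the same order, bind them into one matrix. The following is only a sketch on simulated data: the three temporary files, the gene names G1 to G4 and the object names `tmp_dir` and `res` are made up for illustration and are not part of the original question.

```r
# Simulated stand-in data: 3 small files, 5 genes (rows) x 6 samples (columns).
set.seed(1)
genes   <- c("ABC", "G1", "G2", "G3", "G4")
tmp_dir <- tempfile("rnaseq_")
dir.create(tmp_dir)
files <- file.path(tmp_dir, sprintf("Sample%d.txt", 1:3))
for (f in files) {
  m <- matrix(rnorm(30), nrow = 5, dimnames = list(genes, paste0("S", 1:6)))
  # The header line has one field fewer than the data rows, so
  # read.table() will use the first column as row names.
  write.table(m, f, sep = "\t", quote = FALSE)
}

myfunc <- function(file) {
  df   <- read.table(file, header = TRUE, sep = "\t")
  gene <- as.numeric(df["ABC", ])    # gene expression values of interest
  apply(df, 1, FUN = cor, y = gene)  # correlate gene with all others
}

L <- lapply(files, myfunc)
names(L) <- basename(files)  # label each result with its file name
res <- do.call(cbind, L)     # genes in rows, one column per file
res
```

Keeping `L` as a named list is the safer choice when the files do not all share the same set of genes, since `cbind()` would then misalign or recycle rows.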

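The function can even be reduced to a single call, because cor() accepts matrix arguments. Transposing with t(df) puts the genes in columns, and cor() then correlates each column with y: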

myfunc <- function(file) {
  df <- read.table(file, header=TRUE, sep="\t")
  cor(t(df), y=as.numeric(df["ABC",])) # correlate gene with all others
}
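
One difference worth knowing: the `apply()` version returns a named vector, while `cor()` on a matrix returns a one-column matrix. A small check on simulated data (the data frame below is made up for illustration) suggests that the two agree once the matrix is passed through `drop()`:

```r
# Simulated expression table: 5 genes (rows) x 8 samples (columns).
set.seed(42)
df <- as.data.frame(matrix(
  rnorm(40), nrow = 5,
  dimnames = list(c("ABC", "G1", "G2", "G3", "G4"), paste0("S", 1:8))
))
gene <- as.numeric(df["ABC", ])

by_row    <- apply(df, 1, FUN = cor, y = gene)  # named numeric vector
by_matrix <- cor(t(df), y = gene)               # 5 x 1 matrix

# drop() turns the one-column matrix into a named vector for comparison
all.equal(by_row, drop(by_matrix))
```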

For r - applying the same function to multiple datasets in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43417941/