Loop to add new columns using ifelse

Problem description

I would like to make my code more efficient. I have a survey whose data looks like this:

survey <- data.frame(
                     x = c(1, 6, 2, 60, 75, 40, 27, 10),
                     y = c(100, 340, 670, 700, 450, 200, 136, 145))

#Two lists:
A <- c(3, 6, 7, 27, 40, 41)
t <- c(0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16)

What I did was create new columns, like this:

z <- ifelse(survey$x %in% A, 0, min(t))

for (i in t) {
  survey[paste0("T", i)] <- z
  survey[paste0("T", i)] <- ifelse(z > 0, i, z)
}

But that code takes a while to run. Is there a better way to do it?

Recommended answer

Since the OP is concerned with execution speed, a data.table approach would be faster:

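library(data.table)
# Rows whose x is NOT in A; only these rows get a nonzero value
i1 <- !survey$x %in% A

# Create all T columns by reference, initialized to 0
setDT(survey)[, paste0("T", t) := 0]
for (j in t) {
  # Overwrite the selected rows of column "T<j>" in place
  set(survey, i = which(i1), j = paste0("T", j), value = j)
}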
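
For readers who cannot install data.table, the same columns can probably be built in base R without an explicit loop. The sketch below is not part of the original answer: it assumes the `survey`, `A`, and `t` objects from the question, and the names `keep`, `vals`, and `survey_base` are hypothetical. It multiplies a 0/1 indicator of "x is not in A" by each value of `t` using `outer()`, which yields one column per threshold.

```r
# Example data from the question
survey <- data.frame(
  x = c(1, 6, 2, 60, 75, 40, 27, 10),
  y = c(100, 340, 670, 700, 450, 200, 136, 145))
A <- c(3, 6, 7, 27, 40, 41)
t <- c(0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16)

# TRUE for rows whose x is NOT in A (hypothetical helper name)
keep <- !survey$x %in% A

# Outer product: rows with keep == TRUE get t[j], all other rows get 0
vals <- outer(as.numeric(keep), t)
colnames(vals) <- paste0("T", t)

# Bind the new columns to a copy of the data; the original survey is untouched
survey_base <- cbind(survey, vals)
```

This allocates the whole block of new columns as one matrix, so it may use more memory than the by-reference data.table version on very large inputs; it has not been timed against the benchmark in this post.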

Benchmarks

set.seed(24)
survey1 <- data.frame(x = sample(survey$x, 1e7, replace = TRUE),
       y = sample(survey$y, 1e7, replace = TRUE))

survey2 <- copy(survey1)

# Base R: one vectorized ifelse() per new column
system.time({
  survey1[paste0("T", t)] <- lapply(t, function(y) ifelse(survey1$x %in% A, 0, y))
})
# user  system elapsed
#   8.20    2.75   11.03

# data.table: initialize the columns to 0, then update selected rows by reference
system.time({
  i1 <- !survey2$x %in% A

  setDT(survey2)[, paste0("T", t) := 0]
  for (j in t) {
    set(survey2, i = which(i1), j = paste0("T", j), value = j)
  }
})
# user  system elapsed
#   0.97    0.31    1.28
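
A timing comparison is only meaningful if the compared methods produce the same columns. As a minimal sanity check (not part of the original answer), the sketch below runs the question's loop and the `lapply()`/`ifelse()` variant used in the benchmark on the small example data and compares the results in base R. The names `loop_df` and `lapply_df` are hypothetical.

```r
# Example data from the question
survey <- data.frame(
  x = c(1, 6, 2, 60, 75, 40, 27, 10),
  y = c(100, 340, 670, 700, 450, 200, 136, 145))
A <- c(3, 6, 7, 27, 40, 41)
t <- c(0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16)

# Method 1: the question's loop (with the parenthesis typo fixed)
loop_df <- survey
z <- ifelse(loop_df$x %in% A, 0, min(t))
for (i in t) {
  loop_df[paste0("T", i)] <- ifelse(z > 0, i, z)
}

# Method 2: the lapply()/ifelse() variant from the benchmark
lapply_df <- survey
lapply_df[paste0("T", t)] <- lapply(t, function(y) ifelse(lapply_df$x %in% A, 0, y))

# all.equal() returns TRUE when the two data frames match
same <- all.equal(loop_df, lapply_df)
print(same)
```

The data.table result could be checked the same way after converting it back with `as.data.frame()`, assuming data.table is installed.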


06-16 08:12