Question
What is the idiomatic way to collect results in a loop in R if the number of final results is not known beforehand? Here's a toy example:
results = vector('integer')
i = 1L
while (i < bigBigBIGNumber) {
  if (someCondition(i)) results = c(results, i)
  i = i + 1
}
results
The problem with this example is that (I assume) it has quadratic complexity, because the vector needs to be re-allocated at every append. (Is this correct?) I'm looking for a solution that avoids this.
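To make the assumption concrete: if each c(results, i) allocates a fresh vector and copies the elements collected so far, then the k-th append copies k - 1 elements. This is a rough cost model under that assumption, not a measurement:

```latex
\[
  \sum_{k=1}^{n} (k - 1) \;=\; \frac{n(n-1)}{2} \;=\; O(n^2)
\]
```

Here n is the number of results appended, not the number of loop iterations.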
I found Filter, but it requires pre-generating 1:bigBigBIGNumber, which I want to avoid to conserve memory. (Question: does for (i in 1:N) also pre-generate 1:N and keep it in memory?)
I could build a linked list like this:
results = list()
i = 1L
while (i < bigBigBIGNumber) {
  if (someCondition(i)) results = list(results, i)
  i = i + 1
}
unlist(results)
(Note that this is not concatenation. It builds a nested structure like list(list(list(1),2),3), then flattens it with unlist.)
Is there a better way than this? What is the idiomatic approach that's typically used? (I am very new to R.) I'm looking for suggestions on how to tackle this type of problem. Suggestions for both compact (easy to write) and fast code are most welcome, but I'd like to focus on speed and memory efficiency.
Recommended answer
Here is an algorithm that doubles the size of the output list each time it fills up, which gives roughly linear computation time, as the benchmarks show:
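test <- function(bigBigBIGNumber = 1000) {
  n <- 10L                          # current capacity of the output list
  results <- vector("list", n)
  m <- 0L                           # number of results stored so far
  i <- 1L
  while (i < bigBigBIGNumber) {
    if (runif(1) > 0.5) {           # stand-in for someCondition(i)
      m <- m + 1L
      results[[m]] <- i
      if (m == n) {                 # list is full: double its capacity
        results <- c(results, vector("list", n))
        n <- n * 2L
      }
    }
    i <- i + 1L
  }
  unlist(results)                   # unused NULL slots are dropped
}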
system.time(test(1000))
# user system elapsed
# 0.008 0.000 0.008
system.time(test(10000))
# user system elapsed
# 0.090 0.002 0.093
system.time(test(100000))
# user system elapsed
# 0.885 0.051 0.936
system.time(test(1000000))
# user system elapsed
# 9.428 0.339 9.776
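The same doubling idea can also be sketched with a plain integer vector instead of a list, trimming the unused tail at the end. This is only a minimal variant under stated assumptions: collect_matches and keep are hypothetical names introduced here for illustration, and keep stands in for someCondition from the question.

```r
# Collect every i in 1..(limit - 1) for which keep(i) is TRUE, growing the
# buffer geometrically. `collect_matches` and `keep` are hypothetical names
# used for illustration; they are not from the original answer.
collect_matches <- function(limit, keep) {
  capacity <- 16L
  results <- integer(capacity)      # preallocated buffer
  m <- 0L                           # number of slots actually used
  i <- 1L
  while (i < limit) {
    if (keep(i)) {
      if (m == capacity) {          # buffer is full: double it
        capacity <- capacity * 2L
        length(results) <- capacity # new slots are filled with NA
      }
      m <- m + 1L
      results[m] <- i
    }
    i <- i + 1L
  }
  results[seq_len(m)]               # drop the unused tail
}

# Usage: keep the even numbers below 21
evens <- collect_matches(21L, function(i) i %% 2L == 0L)
print(evens)
```

Because the buffer doubles each time it fills, the total number of elements copied during resizes stays proportional to the number of results, which is why the growth is linear rather than quadratic.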