Question
Why is foreach() with %dopar% slower than a for loop? A small example:
library(parallel)
library(foreach)
library(doParallel)
registerDoParallel(cores = detectCores())
I <- 10^3L
for.loop <- function(I) {
  out <- double(I)
  for (i in seq_len(I))
    out[i] <- sqrt(i)
  out
}
foreach.do <- function(I) {
  out <- foreach(i = seq_len(I), .combine = c) %do%
    sqrt(i)
  out
}
foreach.dopar <- function(I) {
  out <- foreach(i = seq_len(I), .combine = c) %dopar%
    sqrt(i)
  out
}
# identical() compares only its first two arguments, so check each pair
identical(for.loop(I), foreach.do(I)) && identical(for.loop(I), foreach.dopar(I))
## [1] TRUE
library(rbenchmark)
benchmark(for.loop(I), foreach.do(I), foreach.dopar(I))
##               test replications elapsed relative user.self sys.self user.child sys.child
## 1      for.loop(I)          100   0.696    1.000     0.690    0.000        0.0     0.000
## 2    foreach.do(I)          100 121.096  173.989   119.463    0.056        0.0     0.000
## 3 foreach.dopar(I)          100 120.297  172.841   111.214    6.400        3.5     6.734
Some additional information:
sessionInfo()
## R version 3.0.0 (2013-04-03)
## Platform: x86_64-unknown-linux-gnu (64-bit)
##
## locale:
## [1] LC_CTYPE=ru_RU.UTF-8 LC_NUMERIC=C LC_TIME=ru_RU.UTF-8
## [4] LC_COLLATE=ru_RU.UTF-8 LC_MONETARY=ru_RU.UTF-8 LC_MESSAGES=ru_RU.UTF-8
## [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] doMC_1.3.0 rbenchmark_1.0.0 doParallel_1.0.1 iterators_1.0.6 foreach_1.4.0 plyr_1.8
##
## loaded via a namespace (and not attached):
## [1] codetools_0.2-8 compiler_3.0.0 tools_3.0.0
getDoParWorkers()
## [1] 4
Recommended answer
The doParallel documentation specifically mentions, and illustrates with examples, that parallel execution can sometimes be slower: it has to be set up, and the results from the separate parallel processes have to be combined.
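Reference: http://cran.r-project.org/web/packages/doParallel/vignettes/gettingstartedParallel.pdf
Page 3:
With small tasks, the overhead of scheduling the task and returning the result can be greater than the time to execute the task itself, resulting in poor performance.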
Working through the example there, I found that in some cases using the package cut the time needed to execute the code to 50%.
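To illustrate the point about small tasks, here is a minimal sketch (not from the vignette) that hands each worker one large chunk of indices instead of one element per task. The helper names per_element and chunked, and the choice of 2 workers, are assumptions made for this illustration only. Timings will vary by machine, so none are shown.

```r
library(foreach)
library(doParallel)

# Assumption: 2 workers, chosen only to keep the sketch small.
n_workers <- 2
registerDoParallel(cores = n_workers)

I <- 10^3

# One task per element: I tiny tasks, each paying the scheduling overhead.
per_element <- function(I) {
  foreach(i = seq_len(I), .combine = c) %dopar% sqrt(i)
}

# One task per chunk: split the indices into n_chunks contiguous blocks
# and let each task run the vectorised sqrt() over its whole block.
chunked <- function(I, n_chunks = n_workers) {
  idx <- seq_len(I)
  chunks <- split(idx, ceiling(idx * n_chunks / length(idx)))
  foreach(ix = chunks, .combine = c) %dopar% sqrt(ix)
}

# Both versions should produce the same values in the same order.
stopifnot(isTRUE(all.equal(per_element(I), chunked(I))))

# Compare the elapsed times on your own machine.
print(system.time(per_element(I)))
print(system.time(chunked(I)))

stopImplicitCluster()
```

With chunking, the number of scheduled tasks drops from I to the number of workers, so the per-task overhead described in the vignette is paid far less often. For work as cheap as sqrt(), a plain vectorised call such as sqrt(seq_len(I)) is likely to remain the fastest option.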