I am running on Linux and have been using mclapply without any trouble. With parLapply, however, I am running into some errors with clusterEvalQ.
Before I go any further in solving that problem: is there a notable speed difference between the two, or do people only use parLapply because they are on Windows?
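I have read about parLapplyLB and can see the use case for it, but if I look strictly at mclapply versus parLapply, is there much of a speed difference between the FORK approach and the PSOCK approach?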
The nature of my function may determine the answer; it uses stri_extract.
Best answer
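Some quick benchmarks suggest that mclapply may be slightly faster, but this probably depends on the specific system and problem. The more balanced the jobs and the slower the actual tasks, the less it matters which function you use.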
library(parallel)
library(microbenchmark)
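# Benchmark 1: unbalanced jobs (vector lengths from 10^1 to 10^7).
# The parLapply timing includes cluster startup and shutdown.
microbenchmark(
  parLapply = {
    cl <- makeCluster(2)
    parLapply(cl, rep(1:7, 3), function(x) {set.seed(1); rnorm(10^x)})
    stopCluster(cl)
  },
  mclapply = mclapply(rep(1:7, 3), function(x) {set.seed(1); rnorm(10^x)}, mc.cores = 2),
  times = 10
)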
#Unit: seconds
# expr min lq mean median uq max neval
#parLapply 1.85548 2.04397 3.332970 3.071284 4.323514 6.294364 10
#mclapply 1.62610 1.65288 2.217407 1.849594 2.243418 5.435189 10
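# Benchmark 2: balanced jobs (20 vectors of length 10^6 each).
# The parLapply timing again includes cluster startup and shutdown.
microbenchmark(
  parLapply = {
    cl <- makeCluster(2)
    parLapply(cl, rep(6, 20), function(x) {set.seed(1); rnorm(10^x)})
    stopCluster(cl)
  },
  mclapply = mclapply(rep(6, 20), function(x) {set.seed(1); rnorm(10^x)}, mc.cores = 2),
  times = 10
)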
#Unit: milliseconds
# expr min lq mean median uq max neval
#parLapply 1150.657 1188.9750 1705.1364 1242.739 2071.276 3785.516 10
# mclapply 820.692 932.2262 994.4404 1000.402 1079.930 1117.863 10
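The parLapply timings above include makeCluster() and stopCluster(), so part of the gap may simply be the cost of starting the PSOCK workers. The following is a minimal sketch, not a rigorous benchmark, that creates the cluster once outside the timed region. The names task, inputs and cores are introduced here for illustration only, and the problem size is reduced (10^5 instead of 10^6) so that it runs quickly.

```r
library(parallel)

# Same kind of task as in the balanced benchmark, with a smaller size (assumption).
task <- function(x) { set.seed(1); rnorm(10^x) }
inputs <- rep(5, 20)

# Start the PSOCK cluster once, outside the timed region.
cl <- makeCluster(2)
t_psock <- system.time(res_psock <- parLapply(cl, inputs, task))
stopCluster(cl)

# mclapply relies on forking; on Windows mc.cores must be 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
t_fork <- system.time(res_fork <- mclapply(inputs, task, mc.cores = cores))

# Each task sets its own seed, so both approaches should return the same values.
same_result <- identical(res_psock, res_fork)
print(same_result)

# Elapsed wall-clock time in seconds for each approach.
print(c(psock = t_psock[["elapsed"]], fork = t_fork[["elapsed"]]))
```

If the cluster is reused for many calls in a real session, this is arguably the fairer comparison; if a cluster is created for every call, the original benchmark is closer to what you would observe.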
sessionInfo()
#R version 3.3.1 (2016-06-21)
#Platform: x86_64-pc-linux-gnu (64-bit)
#Running under: Ubuntu 14.04.5 LTS
#
#locale:
# [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
# [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
#
#attached base packages:
#[1] parallel stats graphics grDevices utils datasets methods base
#
#other attached packages:
#[1] microbenchmark_1.4-2.1 doParallel_1.0.10 iterators_1.0.8 foreach_1.4.3
#
#loaded via a namespace (and not attached):
# [1] colorspace_1.2-6 scales_0.4.0 plyr_1.8.4 tools_3.3.1 gtable_0.2.0 Rcpp_0.12.4
# [7] ggplot2_2.1.0 codetools_0.2-14 grid_3.3.1 munsell_0.4.3