parLapply called from inside a function unexpectedly copies data to the nodes

Problem description

I have a large list (~30GB) and functions as follows:

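library(parallel)

cl <- makeCluster(24, outfile = "")

# Case 1: the worker function is passed directly
Foo1 <- function(cl, largeList) {
  return(parLapply(cl, largeList, Bar1))
}

Bar1 <- function(listElement) {
  return(nrow(listElement))
}

# Case 2: the worker function is wrapped in an anonymous function
# created inside Foo2, with arg exported to the nodes first
Foo2 <- function(cl, largeList, arg) {
  clusterExport(cl, list("arg"), envir = environment())
  return(parLapply(cl, largeList, function(x) Bar2(x, arg)))
}

Bar2 <- function(listElement, arg) {
  return(nrow(listElement))
}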

The following call works without issue:

Foo1(cl, largeList)

Watching the memory usage for each process I can see that only one list element is being copied to each node.

However, when calling:

Foo2(cl, largeList, 0)

a copy of largeList is being copied to each node. Stepping through Foo2, the largeList copying is not happening at clusterExport, but rather on parLapply. Also, when I execute the body of Foo2 from the global environment (not within a function), there are no issues. What is causing this?

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 21 (Twenty One)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  splines   stats     graphics  grDevices utils
[7] datasets  methods   base

other attached packages:
[1] xts_0.9-7           zoo_1.7-12          snow_0.3-13
[4] Rcpp_0.12.2         randomForest_4.6-12 gbm_2.1.1
[7] lattice_0.20-33     survival_2.38-3     e1071_1.6-7

loaded via a namespace (and not attached):
[1] class_7.3-13 tools_3.2.2  grid_3.2.2
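
One way to check where the extra data comes from is to measure how large the worker function becomes when it is serialized, since that is what gets shipped to each node. The sketch below is not from the original post: make_worker, bigList, f_inner and f_global are hypothetical names, and the small list stands in for the 30GB one. It assumes the usual behaviour of R's serialization, where a closure is written out together with its enclosing environment unless that environment is the global environment or a namespace.

```r
# Hypothetical helper that mimics Foo2: it builds an anonymous function
# whose enclosing environment is this call frame, which also holds largeList.
make_worker <- function(largeList, arg) {
  force(largeList)  # make sure the argument has been evaluated, as it is in Foo2
  function(x) nrow(x) + arg
}

# Small stand-in for the real list (assumption: 4 data frames of 1e5 rows each)
bigList <- replicate(4, data.frame(v = runif(1e5)), simplify = FALSE)

# Closure created inside a function: its environment contains bigList
f_inner <- make_worker(bigList, 0)

# Top-level function: its environment is the global environment,
# which is serialized as a reference rather than by value
f_global <- function(x) nrow(x)
environment(f_global) <- globalenv()

# Number of bytes each function occupies once serialized
size_inner  <- length(serialize(f_inner, NULL))
size_global <- length(serialize(f_global, NULL))

cat("closure built inside a function:", size_inner, "bytes\n")
cat("top-level function:             ", size_global, "bytes\n")
```

If the closure built inside the function turns out far larger than the top-level one, the function's call frame (including the list) is travelling with it. In that case, one option worth trying is to pass a top-level function and its extra argument directly, for example parLapply(cl, largeList, Bar2, arg), so that no closure over the frame of Foo2 is created.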

Recommended answer