Problem description
I'm writing a function (myFUN) that calls parallel::parApply at one point, with a function yourFUN supplied as an argument.
In many situations, yourFUN will call custom functions defined in the global environment.
So, while I can pass "yourFUN" to parallel::clusterExport, I cannot know the names of the functions it calls beforehand, and the call fails with an error because the worker cannot find them.
I don't want to export the whole enclosing environment of yourFUN, since it might be very big.
Is there a way to export only the variables needed to run yourFUN?
The actual function is very long, so here is a minimal example of the error:

    mydata <- matrix(data = 1:9, 3, 3)
    perfFUN <- function(x) 2*x
    opt_perfFUN <- function(y) max(perfFUN(y))
    avg_perfFUN <- function(w) perfFUN(mean(w))

    myFUN <- function(data, yourFUN, n_cores = 1) {
      cl <- parallel::makeCluster(n_cores)
      parallel::clusterExport(cl, varlist = c("yourFUN"), envir = environment())
      parallel::parApply(cl, data, 1, yourFUN)
    }

    myFUN(data = mydata, yourFUN = opt_perfFUN)
    myFUN(data = mydata, yourFUN = avg_perfFUN)

    Error in checkForRemoteErrors(val) : one node produced an error: could not find function "perfFUN"

Thank you very much!
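A possible solution is to export every object that ls() finds in the enclosing environment of yourFUN, and to stop the cluster when the function exits. Note that this ships the whole environment to the workers, so it fixes the error but not the size concern: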
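
    myFUN <- function(data, yourFUN, n_cores = 1) {
      cl <- parallel::makeCluster(n_cores)
      # Shut the workers down even if parApply fails.
      on.exit(parallel::stopCluster(cl), add = TRUE)
      # Environment where yourFUN was defined (often the global environment).
      envir <- environment(yourFUN)
      # Copy all objects listed by ls(envir) to every worker.
      parallel::clusterExport(cl, varlist = ls(envir), envir = envir)
      parallel::parApply(cl, data, 1, yourFUN)
    }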
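
If exporting everything in that environment is too heavy, a narrower variant might look like the sketch below. It uses codetools::findGlobals (codetools normally ships with R as a recommended package) to list the names yourFUN refers to, keeps only those defined directly in the enclosing environment of yourFUN, and follows nested calls. The helper name find_needed_globals is hypothetical and not part of parallel or codetools. The sketch assumes all helpers live in that single environment, and because findGlobals is a static analysis it will miss names that are only looked up dynamically (for example through get() or do.call() with a string).

```r
mydata <- matrix(data = 1:9, 3, 3)
perfFUN <- function(x) 2*x
opt_perfFUN <- function(y) max(perfFUN(y))
avg_perfFUN <- function(w) perfFUN(mean(w))

# Hypothetical helper: names of objects in `envir` that `fun` depends on,
# directly or through other functions defined in `envir`.
find_needed_globals <- function(fun, envir = environment(fun)) {
  needed <- character(0)
  queue <- list(fun)
  while (length(queue) > 0) {
    f <- queue[[1]]
    queue <- queue[-1]
    for (nm in codetools::findGlobals(f, merge = TRUE)) {
      if (nm %in% needed) next
      # Look only in `envir` itself, so base and package functions are skipped.
      if (!exists(nm, envir = envir, inherits = FALSE)) next
      needed <- c(needed, nm)
      obj <- get(nm, envir = envir, inherits = FALSE)
      # Closures may reference further globals, so inspect them as well.
      if (typeof(obj) == "closure") queue <- c(queue, list(obj))
    }
  }
  needed
}

myFUN <- function(data, yourFUN, n_cores = 1) {
  cl <- parallel::makeCluster(n_cores)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  envir <- environment(yourFUN)
  # Export only what the static analysis found, not ls(envir).
  needed <- find_needed_globals(yourFUN, envir)
  parallel::clusterExport(cl, varlist = needed, envir = envir)
  parallel::parApply(cl, data, 1, yourFUN)
}

find_needed_globals(opt_perfFUN)  # "perfFUN"
```

With these definitions, myFUN(data = mydata, yourFUN = opt_perfFUN) should run without the "could not find function" error, since perfFUN is now copied to the worker while unrelated objects in the global environment stay behind.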