I want to optimize a function from an R package using optimParallel. So far I have only optimized functions that I wrote myself in my own environment, and that worked. But functions from any package fail with an error. I checked with .libPaths() that the library paths are the same on each node, and I used Sys.info() to check for any other differences. Here is an example (not a meaningful one, but it shows the problem):


.libPaths()
[1] "C:/Users/Name/Documents/R/win-library/3.5" "C:/Program Files/R/R-3.5.1/library"

library(optimParallel)
cl <- makeCluster(2) # also tried to set "master" to my IP
setDefaultCluster(cl) # optimParallel() uses the default cluster
clusterEvalQ(cl, .libPaths())
optimParallel(par=0, dnorm, mean=1, method = "L-BFGS-B")$par
Error in checkForRemoteErrors(val) : 
   2 nodes produced errors; first error: object 'C_dnorm' not found

#for comparison 
optim(par=0, dnorm, mean=1, method = "L-BFGS-B")$par
[1] -5.263924



The problem is solved in optimParallel version 0.7-4.

That version is available on CRAN: https://CRAN.R-project.org/package=optimParallel

For versions before 0.7-4, a workaround is to wrap dnorm() in a function defined in .GlobalEnv.

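library(optimParallel)
cl <- makeCluster(2)
setDefaultCluster(cl) # optimParallel() uses the default cluster
f <- function(x, mean) dnorm(x, mean=mean) # wrapper defined in .GlobalEnv
optimParallel(par=0, f, mean=1, method="L-BFGS-B")$par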
[1] -5.263924


A more difficult task is to explain why the problem occurs:

  • optimParallel() uses parallel::parLapply() to evaluate f.
  • parLapply() has the arguments cl, X, and fun.
  • If parLapply() were used without pre-processing the arguments passed via ... of optimParallel(), f could not have arguments named cl, X, or fun, because this would cause errors like:

Error in lapply(X = x, FUN = f, ...) (from #2) : 
formal argument "X" matched by multiple actual arguments
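
The name clash can be reproduced without a cluster by using plain lapply(), which is the call that appears in the error message above. This is only a minimal sketch in base R; the function f and its argument X are made up for illustration.

```r
# Hypothetical objective function whose second argument happens to be named X,
# the same name as the first formal argument of lapply().
f <- function(x, X) x + X

# Passing X = 10 through ... collides with lapply's own X argument,
# so R stops during argument matching, before f is ever called.
msg <- tryCatch(
  lapply(X = 1:2, FUN = f, X = 10),
  error = function(e) conditionMessage(e)
)
print(msg)
```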

  • Simply put, optimParallel() avoids this error by removing all arguments from f, putting them into an environment, and evaluating f in that environment.
  • One problem with that approach arises when f is defined in another R package and links to compiled code. That is the case illustrated in the question above.

    Suggestions for better approaches to handle the issue are welcome. I opened a corresponding question here. As long as there is no better solution, one can use the workaround illustrated above.
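
Why evaluating f in a separate environment breaks dnorm() can be seen in a single R session. The sketch below is an assumption about the mechanism, not optimParallel's actual code: it simply re-homes a function into a fresh environment whose parent is .GlobalEnv (the names e, f, and g are illustrative). stats::dnorm() reaches compiled code through the symbol C_dnorm, which lives in the stats namespace and is not exported, so the re-homed copy can no longer find it. The wrapper only refers to dnorm by name, and that name is still found on the search path with its namespace intact.

```r
# Fresh evaluation environment; its parent is the global environment,
# so variable lookups never pass through the stats namespace.
e <- new.env(parent = globalenv())

# Re-home a copy of dnorm(): its body refers to the unexported C_dnorm.
f <- stats::dnorm
environment(f) <- e
direct <- tryCatch(f(0, mean = 1), error = function(err) conditionMessage(err))

# Re-home the wrapper from the workaround instead.
g <- function(x, mean) dnorm(x, mean = mean)
environment(g) <- e
wrapped <- g(0, mean = 1)

print(direct)   # an error message mentioning C_dnorm
print(wrapped)  # same value as stats::dnorm(0, mean = 1)
```

This suggests the wrapper works because it carries no direct reference to namespace-internal objects, while the function it calls keeps its own environment.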

