Question
So I have a moderately complex set of requirements for my worker processes. I want to use the master-slave topology and a non-default working directory. I also want to mix local and remote workers.
As far as I can tell from reading the --machine-file section of the documentation, it will not let me do that.
So I am looking at the -L <file> argument, which loads <file> on all processors.
So if I do not use the -p or --machine-file flags, there is initially only one processor, so "all processors" just means that one processor.
So I tried the following as start_workers.jl:
addprocs([
        ("cluster_c4_1", :auto),
        ("cluster_c4_2", :auto)
    ],
    dir="/mnt/",
    topology=:master_slave
)
addprocs(
    dir="/mnt/",
    topology=:master_slave
)
test.jl
println("*************")
println(workers())
println("-------------")
Running it:
>julia -L start_workers.jl pl.jl
*************
[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
-------------
So it all looks good; I got my 20 workers. Have I done anything unreasonable? Is this the best way?
Recommended answer
That's exactly how I'm deploying it on an HPC cluster under the Torque scheduler. In fact, I'm in the process of rewriting the cluster manager to support more options when adding processes through the Torque scheduling system in particular, so I've spent quite a bit of time looking into this.
You might also want to be aware of the ClusterManagers package (Pkg.add("ClusterManagers")), whose cluster managers extend addprocs to a variety of environments, such as when you need to request resources from a scheduler. It looks like passwordless ssh is possible for you, so the default cluster manager is sufficient in your case.
I don't believe there is any way of defining the extra topology and directory parameters on the command line, so your approach is correct.
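For readers on a recent Julia (1.x), here is a minimal sketch of the same idea, under some assumptions that are not in the question: addprocs lives in the Distributed standard library, the topology keyword is spelled :master_worker (as far as I know, :master_slave was renamed), and workdir is a hypothetical stand-in for "/mnt/". It only starts local workers so that it runs without ssh access.

```julia
using Distributed

# Hypothetical working directory; the question uses "/mnt/".
workdir = tempdir()

# Local workers only. Remote hosts would be passed as a first positional
# argument, as in the question, e.g.
#   addprocs([("cluster_c4_1", :auto)]; dir=workdir, topology=:master_worker)
addprocs(2; dir=workdir, topology=:master_worker)

# Each worker should report the working directory it was started in.
for w in workers()
    println(w, " => ", remotecall_fetch(pwd, w))
end
```

Saved as a script and passed with -L, this would play the same role as start_workers.jl above.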