Problem description
Working in R 2.14.1 on Windows 7.
Using the parallel package in R, I'm trying to take advantage of cores outside my local machine that are available on my network. All the remote hosts I am connecting to are identical Windows machines.
The basic form of the commands to make the connection is as follows.
library(parallel)
#assume 8 cores per machine
cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)))
Of course, debugging these things can be pretty tricky, but here is where I'm at with it.
If I specify the manual = TRUE flag as below
cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)), manual=TRUE)
there are no problems connecting to the remote host and running a parallel process. The computers have identical setups to the one I am working on. Yet when this manual flag is not set, the connection command hangs.
This suggests to me that, since the manual flag bypasses ssh when making the connection to the host, ssh is the problem when manual = FALSE.
At the moment it is not guaranteed that the remote computers have ssh on them. The question is: given that I have all the pertinent Windows login information for my remote hosts, and that I cannot change the settings on the remote computers, how would I connect to cores on remote machines with the parallel package in R without specifying manual = TRUE?
Alternatively, if ssh must be installed for this to work, let's assume all computers have ssh on them. How would I connect to cores on the remote machines without circumventing ssh?
If you need any more information, please let me know. I appreciate your time.
8-26-14
Thanks to Steve Weston for his insights. Once my system is up and running, I will post an update with the exact tools and setup I used to get it working.
Feel free to comment or post if you have anything to add about the best route for connecting remotely from one Windows machine to another via makePSOCKcluster with the manual flag set to FALSE.
Recommended answer
When creating a PSOCK cluster with manual=FALSE, the only way to start a worker on a remote machine is with "ssh", "rsh", or something command-line compatible, such as "plink" from PuTTY. The reason is that makePSOCKcluster starts the remote workers by using the "system" function to execute commands of the form:
ssh -l user otherhost '/usr/lib/R/bin/Rscript' -e 'parallel:::.slaveRSOCK()' MASTER=myhost PORT=10187 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE
You can confirm this by looking at the source code for the newPSOCKnode function in the file snowSOCK.R from the parallel package.
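If you don't have the package sources at hand, a quick way to look at the same function from an R session is sketched below. It assumes newPSOCKnode is still an internal (unexported) function of parallel, which may differ between R versions, and that its body refers to an option named rshcmd.

```r
library(parallel)

# newPSOCKnode is internal to parallel, hence the triple colon.
# Internal functions can change or disappear between R versions.
src <- deparse(parallel:::newPSOCKnode)

# Show the lines that mention the remote-shell command option.
grep("rshcmd", src, value = TRUE)
```

Printing `parallel:::newPSOCKnode` directly shows the whole function if the filtered lines are not enough.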
For this to work, the ssh-compatible command must be available on the local machine and a corresponding ssh daemon must be running on each of the remote machines; otherwise makePSOCKcluster will simply hang. I've found that installing a good, working ssh daemon is the difficult part on Windows.
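As a rough sketch of what that looks like from the R side, the launcher can be swapped through the rshcmd option. Everything here is an assumption rather than a tested setup: the helper name make_remote_cluster is made up for illustration, "plink" is assumed to be on the local PATH and able to authenticate without a prompt, "Rscript" is assumed to be on the PATH of the remote machine, and the user name is a placeholder.

```r
library(parallel)

# Hypothetical helper: launch PSOCK workers through an ssh-compatible client.
# The defaults are placeholders, not values taken from the question.
make_remote_cluster <- function(hosts,
                                rshcmd = "plink",
                                user = Sys.info()[["user"]],
                                rscript = "Rscript") {
  # rshcmd, user, rscript and manual are passed on as cluster options.
  makePSOCKcluster(hosts, rshcmd = rshcmd, user = user, rscript = rscript,
                   manual = FALSE)
}

# Same host layout as in the question: 8 workers per machine.
hosts <- c(rep("localhost", 8), rep("otherhost", 8))

# Uncomment once an ssh daemon is reachable on otherhost; without one,
# the call hangs as described above.
# cl <- make_remote_cluster(hosts)
# clusterCall(cl, function() Sys.info()[["nodename"]])
# stopCluster(cl)
```

The clusterCall line is only a sanity check: it should report each worker's machine name, so you can see whether any workers actually started on the remote host.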
Unfortunately, manual=TRUE is generally the easiest way to create a PSOCK cluster on multiple Windows machines.