Problem description
So far I have run Spark only on Linux machines and VMs (bridged networking), but now I am interested in utilizing more computers as slaves. It would be handy to distribute a Spark slave Docker container onto computers and have them automatically connect themselves to a hard-coded Spark master ip. This sort of works already, but I am having trouble configuring the right SPARK_LOCAL_IP (or --host parameter for start-slave.sh) on the slave containers.
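Concretely, each slave container is launched roughly like this; the image name (spark-worker), the install path and the ips below are placeholders rather than my actual values, and I run the Worker class in the foreground instead of start-slave.sh only so that the container keeps running:

```bash
# Sketch of a slave container launch; spark-worker, /opt/spark and the ips
# are placeholders. 10.0.2.10 stands for the hard-coded master,
# 10.0.2.15 for the slave host's own LAN ip.
docker run -d --name spark-slave \
  -e SPARK_PUBLIC_DNS=10.0.2.15 \
  spark-worker \
  /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
    spark://10.0.2.10:7077
```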
I think I correctly configured the SPARK_PUBLIC_DNS env variable to match the host machine's network-accessible ip (from the 10.0.x.x address space); at least it is shown on the Spark master web UI and is accessible from all machines.
I have also set SPARK_WORKER_OPTS and Docker port forwards as instructed at http://sometechshit.blogspot.ru/2015/04/running-spark-standalone-cluster-in.html, but in my case the Spark master is running on another machine and not inside Docker. I am launching Spark jobs from another machine within the network, which possibly also runs a slave itself.
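For reference, the kind of port setup I mean looks like this; the concrete port numbers are my own choices (fixed so they can be forwarded), not values prescribed by Spark or by the linked post:

```bash
# Fix the worker's ports so they can be forwarded out of the container;
# 8881, 8081 and 7005 are arbitrary choices of mine.
docker run -d \
  -p 8881:8881 -p 8081:8081 -p 7005:7005 \
  -e SPARK_WORKER_PORT=8881 \
  -e SPARK_WORKER_WEBUI_PORT=8081 \
  -e SPARK_WORKER_OPTS="-Dspark.blockManager.port=7005" \
  -e SPARK_PUBLIC_DNS=10.0.2.15 \
  spark-worker \
  /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
    spark://10.0.2.10:7077
```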
Things I have tried (see the sketch after this list):
- Not configuring SPARK_LOCAL_IP at all: the slave binds to the container's ip (e.g. 172.17.0.45), it cannot be reached from the master or the driver, yet the computation still works most of the time
- Binding to 0.0.0.0: the slave chats with the master and establishes some connection, but then it dies; another slave shows up and goes away, and they keep looping like this
- Binding to the host machine's ip: start-up fails because that ip is not visible inside the container, although it would be reachable thanks to the configured port forwarding
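The three attempts above correspond roughly to these launch variants inside the container (master and host ips are example values):

```bash
# 1) No SPARK_LOCAL_IP at all: the worker binds to the container ip (e.g. 172.17.0.45)
/opt/spark/sbin/start-slave.sh spark://10.0.2.10:7077

# 2) Bind to all interfaces
SPARK_LOCAL_IP=0.0.0.0 /opt/spark/sbin/start-slave.sh spark://10.0.2.10:7077

# 3) Bind to the host machine's ip: start-up fails because the address
#    does not exist inside the container
SPARK_LOCAL_IP=10.0.2.15 /opt/spark/sbin/start-slave.sh spark://10.0.2.10:7077
```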
I wonder why the configured SPARK_PUBLIC_DNS isn't being used when connecting to slaves? I thought SPARK_LOCAL_IP would only affect local binding and not be revealed to external computers.
At https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/troubleshooting/connectivity_issues.html they instruct to "set SPARK_LOCAL_IP to a cluster-addressable hostname for the driver, master, and worker processes"; is this the only option? I would rather avoid the extra DNS configuration and just use ips to route the traffic between the computers. Or is there an easy way to achieve this?
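As far as I understand, their suggestion boils down to something like this in conf/spark-env.sh on every node; the hostname and ip below are made-up examples:

```bash
# conf/spark-env.sh on each node
# What the knowledge base recommends: a hostname every machine can resolve
export SPARK_LOCAL_IP=worker-1.cluster.local
# What I would prefer: a plain ip, no extra DNS setup
# export SPARK_LOCAL_IP=10.0.2.15
```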
Edit:
To summarize the current set-up:
- Master is running on Linux (a VirtualBox VM on Windows with bridged networking)
- The driver submits jobs from another Windows machine, which works great
- The Docker image for starting up slaves is distributed as a saved .tar.gz file, loaded (curl xyz | gunzip | docker load) and started on other machines within the network (a sketch of this pipeline follows below); this is where the private/public ip configuration problem shows up
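The image distribution pipeline from the last point looks roughly like this; the image name and the URL are placeholders for wherever the file is actually hosted:

```bash
# On the machine where the image was built: save it as a compressed tarball
docker save spark-worker | gzip > spark-worker.tar.gz

# On each new slave machine: fetch and load it, then start a container
# (http://fileserver.local/... is only a placeholder for the real location)
curl http://fileserver.local/spark-worker.tar.gz | gunzip | docker load
```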
Recommended answer
I think I found a solution for my use-case (one Spark container per host OS):
- Use --net host with docker run => the host's eth0 is visible in the container
- Set SPARK_PUBLIC_DNS and SPARK_LOCAL_IP to the host's ip, ignore the docker0 172.x.x.x address
Spark can bind to the host's ip and other machines can communicate with it as well; port forwarding takes care of the rest. DNS or any complex configuration was not needed. I haven't tested this thoroughly, but so far so good.
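Concretely, the working launch looks something like this; the ips and the image name are placeholders for my actual values:

```bash
# 10.0.2.15 stands for the slave host's own LAN ip, 10.0.2.10 for the master;
# the docker0 172.x.x.x address is deliberately not used anywhere.
docker run -d --net host \
  -e SPARK_PUBLIC_DNS=10.0.2.15 \
  -e SPARK_LOCAL_IP=10.0.2.15 \
  spark-worker \
  /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
    spark://10.0.2.10:7077
```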
Edit: Note that these instructions are for Spark 1.x; with Spark 2.x only SPARK_PUBLIC_DNS is required, and I think SPARK_LOCAL_IP is deprecated.