Question
Is it normal that Spark won't ship the JAR file (containing the Spark application) automatically from master to slave? In earlier versions (used on Amazon Web Services) it worked. Did this behavior change in version 1.2.2, or is the problem caused by a cluster without public DNS addresses? Or does this automatic JAR-copying only work in an AWS cluster?
Below is my submit call:
./spark-submit --class prototype.Test --master spark://192.168.178.128:7077 --deploy-mode cluster ~/test.jar
Info: the files listed with the --jars parameter are copied to the workers.
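For completeness, a sketch of what passing dependencies with --jars might look like, based on the submit call above (the dependency paths are placeholders, not from the original post):

```shell
# Hypothetical example: ship additional dependency JARs to the workers
# explicitly via --jars (comma-separated list). The application JAR itself
# is still the last argument.
./spark-submit \
  --class prototype.Test \
  --master spark://192.168.178.128:7077 \
  --jars /path/to/dependency-a.jar,/path/to/dependency-b.jar \
  ~/test.jar
```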
Accepted Answer
That was my own fault! -> Don't use the --deploy-mode parameter on a standard cluster where the driver process is meant to run on the master node.
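Concretely, the submit call from the question works once the --deploy-mode flag is dropped, so the driver runs in client mode (the default) on the machine where spark-submit is invoked:

```shell
# Same command as in the question, minus "--deploy-mode cluster":
# the driver starts inside this spark-submit process (client mode),
# and the application JAR is distributed to the executors from here.
./spark-submit --class prototype.Test --master spark://192.168.178.128:7077 ~/test.jar
```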
See the Spark documentation:
- --deploy-mode: Whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client) (default: client) [...]
[...]
A common deployment strategy is to submit your application from a gateway machine that is physically co-located with your worker machines (e.g. Master node in a standalone EC2 cluster). In this setup, client mode is appropriate. In client mode, the driver is launched directly within the spark-submit process which acts as a client to the cluster. The input and output of the application is attached to the console. Thus, this mode is especially suitable for applications that involve the REPL (e.g. Spark shell). [...]
This concludes this article on Apache Spark: spark-submit not shipping the JAR file. We hope the answer above is helpful.