问题描述
我正在运行 kinesis plus spark 应用程序https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
I am running kinesis plus spark applicationhttps://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
我运行如下
ec2 实例上的命令:
command on ec2 instance :
./spark/bin/spark-submit --class org.apache.spark.examples.streaming.myclassname --master yarn-cluster --num-executors 2 --driver-memory 1g --executor-memory 1g --executor-cores 1 /home/hadoop/test.jar
我已经在 EMR 上安装了 spark.
I have installed spark on EMR.
EMR details
Master instance group - 1 Running MASTER m1.medium
1
Core instance group - 2 Running CORE m1.medium
我低于 INFO 并且它永远不会结束.
I am getting below INFO and it never ends.
15/06/14 11:33:23 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
15/06/14 11:33:23 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container)
15/06/14 11:33:23 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/06/14 11:33:23 INFO yarn.Client: Setting up container launch context for our AM
15/06/14 11:33:23 INFO yarn.Client: Preparing resources for our AM container
15/06/14 11:33:24 INFO yarn.Client: Uploading resource file:/home/hadoop/.versions/spark-1.3.1.e/lib/spark-assembly-1.3.1-hadoop2.4.0.jar -> hdfs://172.31.13.68:9000/user/hadoop/.sparkStaging/application_1434263747091_0023/spark-assembly-1.3.1-hadoop2.4.0.jar
15/06/14 11:33:29 INFO yarn.Client: Uploading resource file:/home/hadoop/test.jar -> hdfs://172.31.13.68:9000/user/hadoop/.sparkStaging/application_1434263747091_0023/test.jar
15/06/14 11:33:31 INFO yarn.Client: Setting up the launch environment for our AM container
15/06/14 11:33:31 INFO spark.SecurityManager: Changing view acls to: hadoop
15/06/14 11:33:31 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/06/14 11:33:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/06/14 11:33:31 INFO yarn.Client: Submitting application 23 to ResourceManager
15/06/14 11:33:31 INFO impl.YarnClientImpl: Submitted application application_1434263747091_0023
15/06/14 11:33:32 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:32 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1434281611893
final status: UNDEFINED
tracking URL: http://172.31.13.68:9046/proxy/application_1434263747091_0023/
user: hadoop
15/06/14 11:33:33 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:34 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:35 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:36 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:37 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:38 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:39 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:40 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:41 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
有人可以告诉我为什么它不起作用吗?
Could somebody please let me know as why it's not working ?
推荐答案
当多个用户试图同时在我们的集群上运行时,我遇到了这个确切的问题.解决方法是更改调度程序的设置.
I had this exact problem when multiple users were trying to run on our cluster at once. The fix was to change setting of the scheduler.
在文件 /etc/hadoop/conf/capacity-scheduler.xml
中,我们将属性 yarn.scheduler.capacity.maximum-am-resource-percent
从0.1
到 0.5
.
In the file /etc/hadoop/conf/capacity-scheduler.xml
we changed the property yarn.scheduler.capacity.maximum-am-resource-percent
from 0.1
to 0.5
.
更改此设置会增加可分配给应用程序主服务器的资源比例,从而增加可能同时运行的主服务器数量,从而增加可能的并发应用程序数量.
Changing this setting increases the fraction of the resources that is made available to be allocated to application masters, increasing the number of masters possible to run at once and hence increasing the number of possible concurrent applications.
这篇关于application_ 的申请报告(状态:已接受)永远不会因 Spark Submit 而结束(在 YARN 上使用 Spark 1.2.0)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!