Question
My Spark cluster has 1 master and 3 workers (4 separate machines, each with 1 core), and the other settings are as in the pictures below, where spark.cores.max is set to 3 and spark.executor.cores is also 3 (in pic-1).
But when I submit my job to the Spark cluster, the Spark web UI shows that only one executor is used (according to the used memory and RDD blocks in pic-2), not all of them. In this case the processing speed is much slower than I expected.
Since I've set the max cores to 3, shouldn't all the executors be used for this job?
How do I configure Spark to distribute the current job to all executors, instead of running it on only one?
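One setting worth checking (a sketch, assuming Spark standalone mode, as the screenshots suggest): with 1-core machines, an executor that requests 3 cores cannot be spread across workers, so lowering the per-executor core request lets each worker host its own executor. The property names below are standard Spark configuration keys; the values mirror this cluster:

```
# spark-defaults.conf — illustrative fragment, assuming standalone mode.
# Each worker has only 1 core, so request 1 core per executor; the job
# can then get up to 3 executors (one per worker) under spark.cores.max.
spark.cores.max        3
spark.executor.cores   1
```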
Thanks a lot.
------------------ pic-1:
------------------ pic-2:
Answer
You said you are running two receivers; what kind of receivers are they (Kafka, HDFS, Twitter)?
Which Spark version are you using?
In my experience, if you are using any receiver other than the file receiver, it will occupy one core permanently. So when you say you have two receivers, two cores will be permanently used for receiving the data, and you are left with only one core to do the work.
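The core accounting described above can be sketched as a quick calculation (plain Python, illustrative only; the function name and its arguments are mine, not part of any Spark API):

```python
def processing_cores(total_cores: int, receiver_count: int) -> int:
    """Cores left for processing in a Spark Streaming job.

    Each non-file receiver pins one core permanently, so only the
    remainder of spark.cores.max is available for processing tasks.
    """
    remaining = total_cores - receiver_count
    if remaining < 1:
        raise ValueError(
            "Not enough cores: Spark Streaming needs at least one core "
            "for processing in addition to one core per receiver."
        )
    return remaining

# With spark.cores.max = 3 and two receivers, only one core does the work:
print(processing_cores(3, 2))  # → 1
```

This is why the Spark Streaming guide recommends allocating more cores than the number of receivers; otherwise the job receives data but has little or no capacity left to process it.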
Please post a screenshot of the Spark master homepage as well, and a screenshot of the job's Streaming page.