Controlling the Number of Executors and Cores in Spark Standalone

Question

I have a Spark standalone server with 16 cores and 64 GB of RAM. Both the master and the worker run on that server. Dynamic allocation is not enabled, and I am on Spark 2.0.

What I don't understand is what happens when I submit my job and specify:

--num-executors 2
--executor-cores 2 

I would expect only 4 cores to be taken up. Yet when the job is submitted, it takes all 16 cores and spins up 8 executors, ignoring the num-executors parameter. If I change the executor-cores parameter to 4, it adjusts accordingly and 4 executors spin up.

Recommended answer

Disclaimer: I don't know whether --num-executors is supposed to work in standalone mode. I haven't seen it used outside YARN.

Note: As pointed out by Marco, --num-executors is no longer in use on YARN.

You can effectively control the number of executors in standalone mode with static allocation (this works on Mesos as well) by combining spark.cores.max and spark.executor.cores, where the number of executors is determined as:

floor(spark.cores.max / spark.executor.cores)
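To check the formula against the numbers in the question, here is a minimal Python sketch. The function name expected_executors and its parameters are hypothetical, not part of Spark. It assumes a single worker, ignores memory limits, and assumes that when spark.cores.max is unset the application claims every core the worker offers, which is the standalone default.

```python
from typing import Optional


def expected_executors(
    executor_cores: int,
    worker_cores: int,
    cores_max: Optional[int] = None,
) -> int:
    """Estimate the executor count on a single standalone worker.

    Hypothetical helper for illustration only; it ignores executor
    memory, which can reduce the real count further.
    """
    if executor_cores <= 0:
        raise ValueError("executor_cores must be positive")
    # Assumption: with spark.cores.max unset, the app takes all worker cores.
    usable = worker_cores if cores_max is None else min(cores_max, worker_cores)
    # floor(spark.cores.max / spark.executor.cores)
    return usable // executor_cores


if __name__ == "__main__":
    # The question's setup: 16 cores, no spark.cores.max.
    print(expected_executors(executor_cores=2, worker_cores=16))
    print(expected_executors(executor_cores=4, worker_cores=16))
    # With the cap suggested in the answer.
    print(expected_executors(executor_cores=2, worker_cores=16, cores_max=4))
```

Under these assumptions the first two calls give 8 and 4 executors, matching what the question observed, and the capped call gives the 2 executors that were originally intended.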

For example:

--conf "spark.cores.max=4" --conf "spark.executor.cores=2"
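To show where these two settings sit in a full submission, the following Python sketch assembles a spark-submit command line. The helper build_submit_command, the master URL spark://localhost:7077, and the application file my_job.py are placeholders assumed for illustration; only the two --conf keys come from the answer above. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import List


def build_submit_command(
    master_url: str,
    app_path: str,
    cores_max: int,
    executor_cores: int,
) -> List[str]:
    """Build a spark-submit argument list (hypothetical helper)."""
    return [
        "spark-submit",
        "--master", master_url,
        # Cap the total cores the application may claim.
        "--conf", f"spark.cores.max={cores_max}",
        # Cores given to each executor.
        "--conf", f"spark.executor.cores={executor_cores}",
        app_path,
    ]


# Placeholder master URL and application file, assumed for this example.
cmd = build_submit_command("spark://localhost:7077", "my_job.py", 4, 2)
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd) would launch the job, provided spark-submit is on the PATH.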
