What is the difference between the number of executors and the number of executor cores in Spark?

Question

I am learning Spark on AWS EMR. While doing so, I am trying to understand the difference between the number of executors (--num-executors) and the number of executor cores (--executor-cores). Can anyone explain it?

Also, when I try to submit the following job, I get an error:

spark-submit --deploy-mode cluster --master yarn --num-executors 1 --executor-cores 5   --executor-memory 1g -–conf spark.yarn.submit.waitAppCompletion=false wordcount.py s3://test/spark-example/input/input.txt s3://test/spark-example/output21

Error: Unrecognized option: -–conf
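
A note on the error itself: in the message, the option is printed as `-–conf`, and its second character appears to be an en dash (U+2013) rather than an ASCII hyphen, which typically happens when a command is copied from a web page or word processor. If that is the case, retyping the option as `--conf` with two plain hyphens should let spark-submit recognize it. The sketch below is only an illustration of how such characters can be spotted; the helper name `find_suspicious_options` is hypothetical, and the argument list assumes the second character really is U+2013.

```python
import unicodedata

# Arguments from the command in the question. The "\u2013" escape spells out
# the en dash that is ASSUMED to be the second character of "-–conf".
ARGS = [
    "--deploy-mode", "cluster",
    "--master", "yarn",
    "--num-executors", "1",
    "--executor-cores", "5",
    "--executor-memory", "1g",
    "-\u2013conf", "spark.yarn.submit.waitAppCompletion=false",
]


def find_suspicious_options(args):
    """Return (argument, character name) pairs for option-like arguments
    that contain a non-ASCII dash character.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    hits = []
    for arg in args:
        # Only look at arguments that start like an option.
        if not arg.startswith(("-", "\u2013", "\u2014")):
            continue
        for ch in arg:
            # Unicode category "Pd" is "Punctuation, dash"; ord > 127 excludes "-".
            if ord(ch) > 127 and unicodedata.category(ch) == "Pd":
                hits.append((arg, unicodedata.name(ch)))
                break
    return hits


if __name__ == "__main__":
    for arg, name in find_suspicious_options(ARGS):
        print(f"{arg!r} contains a non-ASCII dash: {name}")
```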

Recommended answer

The number of executors is the number of distinct YARN containers (think processes/JVMs) that will execute your application.

The number of executor cores is the number of threads you get inside each executor (container).

So the parallelism (the number of concurrent threads/tasks) of your Spark application is #executors × #executor-cores. If you have 10 executors and 5 executor cores, you will (hopefully) have 50 tasks running at the same time.
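
The arithmetic above can be sketched in a few lines of Python. This is a simplified model, not Spark's scheduler: it assumes each task occupies exactly one core (Spark's default of `spark.task.cpus=1`), and the function names and the `pending_tasks` parameter are hypothetical, introduced only to show why the answer says "hopefully": a stage cannot run more tasks at once than it actually has.

```python
def task_slots(num_executors: int, executor_cores: int) -> int:
    """Upper bound on concurrently running tasks:
    number of executors times cores (threads) per executor.
    Assumes one core per task."""
    return num_executors * executor_cores


def concurrent_tasks(num_executors: int, executor_cores: int, pending_tasks: int) -> int:
    """Tasks that can run at once in this simplified model:
    limited by both the available slots and the tasks waiting to run."""
    return min(task_slots(num_executors, executor_cores), pending_tasks)


if __name__ == "__main__":
    # The example from the answer: 10 executors with 5 cores each.
    print(task_slots(10, 5))
    # The command from the question: 1 executor with 5 cores.
    print(task_slots(1, 5))
    # With only 8 tasks available, the 50 slots cannot all be used.
    print(concurrent_tasks(10, 5, pending_tasks=8))
```

With the values from the question's command (`--num-executors 1 --executor-cores 5`), this model gives at most 5 tasks running at the same time.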

08-05 01:18