Problem description
I'm trying to set up a standalone Spark 2.0 server to process an analytics function in parallel. To do this, I want to have a single worker with multiple executors.
I'm using:
- Standalone Spark 2.0
- 8 cores
- 24 GB RAM
- Windows Server 2008
- pyspark (although this appears unrelated)
This is purely for proof-of-concept purposes, but I want to have 8 executors, one per core.
I've tried to follow the other threads on this topic, but for some reason it isn't working for me, e.g. Spark Standalone Number Executors/Cores Control.
My configuration is as follows:
conf\spark-defaults.conf:
spark.cores.max = 8
spark.executor.cores = 1
I have also tried changing my spark-env.sh file, to no avail. Instead, what is happening is that my single worker only has 1 executor on it. It still shows the standalone worker with 1 executor that holds all 8 cores.
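One way to check whether the settings in conf\spark-defaults.conf are actually reaching the application is to read the effective configuration back from the running SparkContext. Below is a minimal pyspark sketch; the executor count goes through the internal _jsc bridge, which is a common workaround rather than an official pyspark API, and the helper name is made up for illustration.

# Assumes an existing SparkContext `sc` created through pyspark.
def show_effective_config(sc):
    conf = sc.getConf()
    print("master              :", sc.master)
    print("spark.cores.max     :", conf.get("spark.cores.max", "not set"))
    print("spark.executor.cores:", conf.get("spark.executor.cores", "not set"))
    # Rough executor count via the JVM gateway (internal API):
    # getExecutorMemoryStatus() has one entry per block manager,
    # including the driver, hence the "- 1".
    executors = sc._jsc.sc().getExecutorMemoryStatus().size() - 1
    print("registered executors:", max(executors, 0))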
Recommended answer

I believe you mixed up local and standalone modes:

- Local mode is a development tool where all processes are executed inside a single JVM. An application is started in local mode by setting the master to local, local[*], or local[n]. Settings such as spark.executor.cores are not applicable in local mode because there is only one embedded executor.
- Standalone mode requires a standalone Spark cluster. It needs a master node (which can be started with the SPARK_HOME/sbin/start-master.sh script) and at least one worker node (which can be started with the SPARK_HOME/sbin/start-slave.sh script). SparkConf should be created with the master node's address (spark://host:port), as in the sketch after this list.
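Putting that together for the setup in the question: a hedged sketch of starting the standalone daemons and then connecting from pyspark with the spark:// master URL instead of local. The host name, port, and memory value are placeholders, and the Windows spark-class commands are an assumption for this environment (the .sh scripts mentioned in the answer need a bash-like shell).

# Sketch only -- HOSTNAME, the port, and the memory values are placeholders.
#
# 1) Start a standalone master and one worker. The answer's scripts
#    (sbin/start-master.sh, sbin/start-slave.sh) apply on Linux/macOS; on
#    Windows the same daemons are typically launched via spark-class:
#
#    bin\spark-class org.apache.spark.deploy.master.Master
#    bin\spark-class org.apache.spark.deploy.worker.Worker spark://HOSTNAME:7077
#
# 2) Connect to the master URL -- not local / local[*] -- so that
#    spark.cores.max and spark.executor.cores actually take effect.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("spark://HOSTNAME:7077")   # standalone master, not local mode
        .setAppName("one-executor-per-core")
        .set("spark.cores.max", "8")          # total cores for this application
        .set("spark.executor.cores", "1")     # 1 core per executor
        .set("spark.executor.memory", "2g"))  # 8 executors x 2g must fit in worker memory

sc = SparkContext(conf=conf)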