Problem Description
I am running an Apache Beam workload on Spark. I initialized the workers with 32GB of memory (each slave is run with -c 2 -m 32G). spark-submit sets driver memory to 30g and executor memory to 16g. However, executors fail with java.lang.OutOfMemoryError: Java heap space.
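For reference, setting those sizes directly on spark-submit would normally look like the sketch below; the flags are standard Spark options, but the application JAR path is a placeholder, not from the original post:

# Hypothetical spark-submit invocation with the memory sizes described above
spark-submit \
  --master spark://$HOSTNAME:7077 \
  --driver-memory 30g \
  --executor-memory 16g \
  path/to/app.jar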
The master GUI indicates that memory per executor is 1024M. In addition, I see that all Java processes are launched with -Xmx1024m. This means spark-submit doesn't propagate its executor settings to the executors.
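One way to see the heap limit the executors actually received is to inspect the running Java processes on a worker host; this is a standard shell check, not part of the original post:

# Print the -Xmx flag of every running Java process on the worker host
ps aux | grep '[j]ava' | grep -o -- '-Xmx[0-9]*[mgMG]'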
The pipeline options are as follows:
--runner PortableRunner \
--job_endpoint=localhost:8099 \
--environment_type=PROCESS \
--environment_config='{"command": "$HOME/beam/sdks/python/container/build/target/launcher/linux_amd64/boot"}'
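These options would be passed when launching the Beam Python pipeline, roughly as sketched below; my_pipeline.py is a hypothetical script name, not from the original post:

# Hypothetical launch of the Beam pipeline with the options above
python my_pipeline.py \
  --runner PortableRunner \
  --job_endpoint=localhost:8099 \
  --environment_type=PROCESS \
  --environment_config='{"command": "$HOME/beam/sdks/python/container/build/target/launcher/linux_amd64/boot"}'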
The job endpoint is set up in the default way:
docker run --rm --network=host --name spark-jobservice apache/beam_spark_job_server:latest --spark-master-url=spark://$HOSTNAME:7077
How do I make sure the settings propagate to the executors?
Update: I set conf/spark-defaults.conf to
spark.driver.memory 32g
spark.executor.memory 32g
and conf/spark-env.sh to
SPARK_EXECUTOR_MEMORY=32g
and restarted the cluster and relaunched everything, but executor memory is still limited to 1024M.
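For completeness, restarting a standalone Spark cluster typically uses the scripts bundled with the distribution; the sketch assumes a standard install under $SPARK_HOME:

# Stop the standalone master and all workers, then bring them back up
$SPARK_HOME/sbin/stop-all.sh
$SPARK_HOME/sbin/start-all.sh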
Recommended Answer
I found the reason and a workaround.
The jobserver container internally runs its own Spark distribution, so the settings configured in the Spark distribution on your local machine have no effect.
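One way to see that the container carries its own configuration is to inspect its environment; this assumes the image ships standard shell utilities and is not from the original post:

# Show Spark-related environment variables inside the running job server container
docker exec spark-jobservice env | grep -i spark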
The solution is thus to change the configuration in the jobserver container, for instance by passing the corresponding environment variable when launching it:
docker run -e SPARK_EXECUTOR_MEMORY=32g --rm --network=host --name spark-jobservice apache/beam_spark_job_server:latest --spark-master-url=spark://$HOSTNAME:7077
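To confirm the variable was actually set on the container, a standard Docker check (not from the original post) is:

# Print the environment variables the container was started with
docker inspect spark-jobservice --format '{{json .Config.Env}}'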