Question
When running a Python job in AWS Glue I get the error:

Reason: Container killed by YARN for exceeding memory limits. 5.6 GB of 5.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead
When running this at the beginning of the script:
print '--- Before Conf --'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print "spark.yarn.executor.memoryOverhead", sc._conf.get("spark.yarn.executor.memoryOverhead")
print '--- Conf --'
sc._conf.setAll([('spark.yarn.executor.memory', '15G'),('spark.yarn.executor.memoryOverhead', '10G'),('spark.yarn.driver.cores','5'),('spark.yarn.executor.cores', '5'), ('spark.yarn.cores.max', '5'), ('spark.yarn.driver.memory','15G')])
print '--- After Conf ---'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print "spark.yarn.executor.memoryOverhead", sc._conf.get("spark.yarn.executor.memoryOverhead")
I get the following output:

spark.yarn.driver.memory None
spark.yarn.driver.cores None
spark.yarn.executor.memory None
spark.yarn.executor.cores None
spark.yarn.executor.memoryOverhead None
--- Conf --
--- After Conf ---
spark.yarn.driver.memory 15G
spark.yarn.driver.cores 5
spark.yarn.executor.memory 15G
spark.yarn.executor.cores 5
spark.yarn.executor.memoryOverhead 10G
It seems like spark.yarn.executor.memoryOverhead is set, so why is it not recognized? I still get the same error.

I have seen other posts about problems with setting spark.yarn.executor.memoryOverhead, but not about a case where it appears to be set and still does not work.
Answer
Open Glue > Jobs > Edit your Job > Script libraries and job parameters (optional) > Job parameters near the bottom.
Set the following > key: --conf  value: spark.yarn.executor.memoryOverhead=1024
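This works because the --conf job parameter is passed to Spark when the job launches, before the SparkContext and the YARN containers exist; calling sc._conf.setAll(...) inside the script only changes the in-memory conf object after the containers have already been sized, which is why the setting was "visible" but never applied. The same parameter can also be set outside the console. Below is a minimal sketch using boto3's update_job, assuming a placeholder job name "my-glue-job" and that the rest of the job definition should be kept unchanged; adjust the value to your needs.

import boto3

glue = boto3.client("glue")

# Fetch the current job definition so existing settings can be preserved.
job = glue.get_job(JobName="my-glue-job")["Job"]  # "my-glue-job" is a placeholder

# Merge the --conf job parameter into the job's default arguments.
default_args = dict(job.get("DefaultArguments", {}))
default_args["--conf"] = "spark.yarn.executor.memoryOverhead=1024"

# update_job replaces the definition, so pass back the existing Role and Command.
glue.update_job(
    JobName="my-glue-job",
    JobUpdate={
        "Role": job["Role"],
        "Command": job["Command"],
        "DefaultArguments": default_args,
    },
)

The next run of the job will start its Spark application with the new memoryOverhead value.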