Problem description
I'm experimenting with EMR a bit, and I'm trying to run a very simple Spark program:
from pyspark.sql.types import IntegerType
mylist = [1, 2, 3, 4]
df = spark.createDataFrame(mylist, IntegerType()).show()
df.write.parquet('/path/to/save', mode='overwrite')
I launch the app by adding a step in the AWS EMR web console: I select the app from S3, select deploy mode cluster, and leave the rest blank.
The app doesn't even launch, and I get the following error: Application application_1564485869414_0002 failed 2 times due to AM Container for appattempt_1564485869414_0002_000002 exited with exitCode: 13
What am I doing wrong?
Recommended answer
The spark variable isn't defined in the code you tried. That might be causing the issue, since the app never gets a Spark session or context.
Try adding:
from pyspark.sql import SparkSession
spark = SparkSession\
.builder\
.getOrCreate()
before using spark.createDataFrame(...).