Problem Description
I am currently setting up an Oozie workflow that uses a Spark action. The Spark code itself works correctly; it has been tested both locally and on YARN. However, when I run it as an Oozie workflow I get the following error:
Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [1]
Having read up on this error, I saw that the most common cause is a problem with the Oozie sharelib. I have added all of the Spark jar files to /user/oozie/share/lib/spark on HDFS, restarted Oozie, and run
sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -sharelibupdate
to ensure the sharelib is properly updated. Unfortunately, none of this has stopped the error from occurring.
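As a first debugging step it is worth confirming what Oozie actually sees in the sharelib; a minimal sketch, assuming the same Oozie URL and HDFS layout used above:
# List the sharelibs Oozie knows about (the list should include "spark")
sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -shareliblist
# List the jars Oozie will put on the classpath of Spark actions
sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -shareliblist spark
# Inspect the backing directory on HDFS
hdfs dfs -ls /user/oozie/share/lib/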
My workflow is as follows:
<workflow-app xmlns='uri:oozie:workflow:0.4' name='SparkBulkLoad'>
    <start to='bulk-load-node'/>
    <action name='bulk-load-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>client</mode>
            <name>BulkLoader</name>
            <jar>${nameNode}/user/spark-test/BulkLoader.py</jar>
            <spark-opts>
                --num-executors 3 --executor-cores 1 --executor-memory 512m --driver-memory 512m
            </spark-opts>
        </spark>
        <ok to='end'/>
        <error to='fail'/>
    </action>
    <kill name='fail'>
        <message>Error occurred while bulk loading files</message>
    </kill>
    <end name='end'/>
</workflow-app>
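Before submitting, the XML can be sanity-checked against the Oozie schemas; a sketch, noting that the exact validate syntax depends on the Oozie version:
# Validate a local copy of the workflow definition against the Oozie schemas
oozie validate workflow.xml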
My job.properties is as follows:
nameNode=hdfs://192.168.26.130:8020
jobTracker=http://192.168.26.130:8050
queueName=spark
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/spark-test/workflow.xml
workflowAppUri=${nameNode}/user/spark-test/BulkLoader.py
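For completeness, a sketch of how a job like this is then submitted (the submit command itself was not shown above; this assumes the Oozie server from the earlier command):
# Submit the workflow and start it immediately
oozie job -oozie http://192.168.26.130:11000/oozie -config job.properties -run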
Any advice would be greatly appreciated.
Answer
I have also specified the libpath:
oozie.libpath=<path>/oozie/share/lib/lib_<timestamp>
This is the value you see in the output after running the command you already used:
sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -sharelibupdate
Example output:
[ShareLib update status]
sharelibDirOld = hdfs://nameservice1/user/oozie/share/lib/lib_20190328034943
host = http://vghd08hr.dc-ratingen.de:11000/oozie
sharelibDirNew = hdfs://nameservice1/user/oozie/share/lib/lib_20190328034943
status = Successful
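To make the connection explicit: the sharelibDirNew value in this output is exactly what goes into oozie.libpath, so in this environment job.properties would carry:
# Values taken from the ShareLib update status above
oozie.use.system.libpath=true
oozie.libpath=hdfs://nameservice1/user/oozie/share/lib/lib_20190328034943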
Optional: you can also set the launcher's YARN environment to point at the Spark 2 parcel inside the Cloudera folder:
oozie.launcher.yarn.app.mapreduce.am.env=/opt/SP/apps/cloudera/parcels/SPARK2-2.2.0.cloudera4-1.cdh5.13.3.p0.603055/lib/spark2
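If you prefer to scope this to a single action instead of the whole job, the spark-action schema also accepts a configuration block (placed before master); a sketch reusing the parcel path above, which will differ on other installations:
<spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
        <!-- Point the launcher AM at the Spark 2 parcel libraries -->
        <property>
            <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
            <value>/opt/SP/apps/cloudera/parcels/SPARK2-2.2.0.cloudera4-1.cdh5.13.3.p0.603055/lib/spark2</value>
        </property>
    </configuration>
    <master>yarn</master>
    <!-- remaining elements as in the workflow above -->
</spark>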
BUT this might not solve the issue. The other hint I have: if you are using Spark 1.x, this file needs to be present in your Oozie sharelib folder:
/user/oozie/share/lib/lib_20190328034943/spark2/oozie-sharelib-spark.jar
If you copy it into your spark2 folder, it solves the "missing SparkMain" issue, but it then asks for other dependencies (that might be a problem specific to my environment). I think it is worth a try, so copy and paste the lib, run your job, and check the logs.
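A minimal sketch of that copy, assuming the Spark 1.x jar sits in the spark directory of the same lib_20190328034943 tree shown above, followed by a sharelib refresh:
# Copy the sharelib jar that provides SparkMain into the spark2 folder
sudo -u oozie hdfs dfs -cp \
    /user/oozie/share/lib/lib_20190328034943/spark/oozie-sharelib-spark*.jar \
    /user/oozie/share/lib/lib_20190328034943/spark2/
# Make Oozie pick up the change
sudo -u oozie oozie admin -oozie http://192.168.26.130:11000/oozie -sharelibupdate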