Problem description
I have a Python script that I want to schedule with Oozie. I am using an Oozie shell action to invoke the script, and the script runs a beeline command. When I run the Oozie workflow, I get the error "sh: beeline: command not found". If I run the script, or just the beeline command, manually from the edge node, it works fine. My data platform is Hortonworks 2.6. Below are my workflow.xml and Python script:
Workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.3" name="hive2-wf">
    <credentials>
        <credential name='hcat-creds' type='hcat'>
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://host:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/[email protected]</value>
            </property>
        </credential>
    </credentials>
    <start to="python-node"/>
    <action name="python-node" cred="hcat-creds">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>run_validations.py</exec>
            <argument>--jdbcURL</argument><argument>${jdbcURL}</argument>
            <argument>--jdbcPrincipal</argument><argument>${jdbcPrincipal}</argument>
            <env-var>PYTHONPATH=/bin/python</env-var>
            <env-var>PYTHON_EGG_CACHE=/tmp</env-var>
            <env-var>PATH=/usr/bin</env-var>
            <env-var>HADOOP_CLASSPATH=${HADOOP_CLASSPATH}</env-var>
            <file>run_validations.py</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
Script:
#!/usr/bin/env python2
import sys, os, commands, datetime, time, getpass, errno
from optparse import OptionParser
import subprocess
from subprocess import Popen, PIPE

def arg_handle():
    usage = "usage: run_validations.py [options]"
    parser = OptionParser(usage)
    parser.add_option("-u", "--jdbcURL",dest="jdbcURL",help="jdbcURL")
    parser.add_option("-p", "--jdbcPrincipal",dest="jdbcPrincipal",help="jdbcPrincipal")
    (options, args) = parser.parse_args()
    print("run_validations.py -> Input : " + str(options))
    return options

def main():
    print("run_validations.py -> Started Run_validations.py")
    options = arg_handle()
    print("JDBC URL : "+options.jdbcURL)
    print("JDBC PRINCIPAL : "+options.jdbcPrincipal)
    beeline_connection = options.jdbcURL+";principal="+options.jdbcPrincipal
    hive_cmd = 'beeline -u "'+beeline_connection+'" -e "select 1+2;"'
    print("Invoked :"+hive_cmd)
    rc,out = commands.getstatusoutput(hive_cmd)
    if(rc==0):
        print("RC : "+str(rc))
        print("Output :")
        print(out)
    else:
        print("RC : "+str(rc))
        print("Output :")
        print(out)

if __name__ == "__main__":
    main()
Output
>>> Invoking Shell command line now >>
Stdoutput run_validations.py -> Started Run_validations.py
Stdoutput run_validations.py -> Input : {'jdbcURL': 'jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2', 'jdbcPrincipal': 'hive/[email protected]'}
Stdoutput Invoked :beeline -u "jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/[email protected]" -e "select 1+2;"
Stdoutput RC : 32512
Stdoutput Output :
Stdoutput sh: beeline: command not found
Exit code of the Shell command 0
<<< Invocation of Shell command completed <<<
Could someone please tell me what it is that I am missing?
Recommended answer
Oozie executes a shell action on some node of the Hadoop cluster (possibly one of the data nodes), not necessarily on the edge node where you tested beeline and the Python script. Beeline is evidently installed on the edge node, which is why your manual test works there.
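The "RC : 32512" line in the question's output is consistent with this. `commands.getstatusoutput` returns a wait-style status rather than a plain exit code, so the value has to be decoded. A minimal sketch, assuming the usual 16-bit wait encoding (exit code in the high byte, signal number in the low seven bits); the helper name `decode_wait_status` is made up for illustration:

```python
def decode_wait_status(status):
    """Split a 16-bit wait status into (exit_code, signal_number)."""
    exit_code = (status >> 8) & 0xFF   # high byte: exit code of the child
    signal_number = status & 0x7F      # low 7 bits: terminating signal, 0 if none
    return exit_code, signal_number


exit_code, signal_number = decode_wait_status(32512)
print("exit code: %d, signal: %d" % (exit_code, signal_number))
```

The decoded exit code is 127, which is the value a POSIX shell returns when it cannot find the command, matching the "sh: beeline: command not found" message.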
The actual problem is that the node where the shell action runs does not seem to have beeline installed, or at least not in a directory on the PATH the action uses. If you have access to that node, log in and check whether beeline is available there.
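If logging in to the worker node is not possible, the same check can be done from inside the shell action itself. The following is only a sketch under stated assumptions: the helper names `find_on_path` and `report` are hypothetical, and the code simply searches the directories listed in PATH the way a shell would. It prints plain single-string messages so it runs under both Python 2 and Python 3.

```python
import os
import socket


def find_on_path(program, path=None):
    """Return the full path of `program` if it is executable on `path`, else None."""
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def report(program="beeline"):
    """Print which host we are on, the PATH in effect, and where `program` is."""
    location = find_on_path(program)
    print("host    : %s" % socket.gethostname())
    print("PATH    : %s" % os.environ.get("PATH", ""))
    print("%s : %s" % (program, location or "not found"))
    return location
```

Calling `report()` at the start of `main()` in run_validations.py would show in the Oozie launcher log which host ran the action and whether beeline is reachable there. Note that the workflow in the question sets `PATH=/usr/bin`, so with that setting only `/usr/bin` would be searched. The script could also call `sys.exit(1)` when `report()` returns `None`, so that the action fails instead of finishing with "Exit code of the Shell command 0".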
I would suggest using a combination of Hive actions and shell actions to accomplish the task you are trying to do.