Problem description
I have a 6-node Cloudera-based Hadoop cluster and I'm trying to connect to an Oracle database from a Sqoop action in Oozie.
I have copied my ojdbc6.jar into the Sqoop lib location (which for me happens to be /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/sqoop/lib/) on all the nodes and have verified that I can run a simple 'sqoop eval' from all 6 nodes.
Now when I run the same command using Oozie's sqoop action, I get "Could not load db driver class: oracle.jdbc.OracleDriver"
I have read this article about using shared libs, and it makes sense to me when we're talking about dependencies specific to my task/action/workflow. But I see a JDBC driver installation as an extension to Sqoop, so I think it belongs in the Sqoop installation lib.
Now the question is: if Sqoop sees the ojdbc6 jar I have put into its lib folder, how come my Oozie workflow doesn't see it?
Is this something expected or am I missing something?
As an aside, where do you guys think is the appropriate location for a JDBC driver jar?
Thanks in advance!
The JDBC driver jar (and any jars it depends on) should go in your Oozie sharelib folder on HDFS. I'm running Hortonworks Data Platform 1.2 instead of Cloudera 4.2, so the details may vary, but my JDBC driver is located in /user/oozie/share/lib/sqoop. This should allow you to run Sqoop with the JDBC driver via Oozie.
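For what it's worth, here is a rough sketch of how the copy into the sharelib could be scripted. It only builds and prints the `hadoop fs` commands so you can review them before running anything. The sharelib path is the one from my HDP setup and may differ on CDH, and the local jar path `/tmp/ojdbc6.jar` is just a made-up example.

```python
import shlex

# Assumptions: the sharelib path is the one mentioned above (HDP 1.2) and may
# differ on other distributions; the local jar path is a hypothetical example.
SHARELIB_SQOOP_DIR = "/user/oozie/share/lib/sqoop"
LOCAL_JAR = "/tmp/ojdbc6.jar"


def build_put_command(local_jar, hdfs_dir):
    """Return the argv list for copying a local jar into an HDFS directory."""
    return ["hadoop", "fs", "-put", local_jar, hdfs_dir]


def build_ls_command(hdfs_dir):
    """Return the argv list for listing an HDFS directory."""
    return ["hadoop", "fs", "-ls", hdfs_dir]


def render(argv):
    """Quote an argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be checked first.
    print(render(build_put_command(LOCAL_JAR, SHARELIB_SQOOP_DIR)))
    print(render(build_ls_command(SHARELIB_SQOOP_DIR)))
```

You would normally run the printed commands as a user with write access to the sharelib directory, then use the listing to confirm the jar is there.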
It is not necessary to put the JDBC driver jar in the Sqoop lib on the data nodes. In my setup I can't even run a simple sqoop eval from the command line on my data nodes. I understand the logic behind why you thought this would work. The reason the JDBC driver jar needs to be on HDFS is so that all the data nodes have access to it. Your solution should accomplish the same goal. I'm not familiar enough with the inner workings of Oozie to say why using the sharelib works but your solution does not.
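One thing that may be worth checking, though I can't say it is the cause in your case: as far as I know, Oozie only adds the sharelib to an action's classpath when the job properties set `oozie.use.system.libpath=true`. Below is a small sketch that checks a job.properties text for that setting. The sample contents (including the `nameNode` value) are made up for illustration, and the parser only handles plain `key=value` lines.

```python
def uses_system_libpath(properties_text):
    """Return True if oozie.use.system.libpath is set to true in the text.

    Only simple key=value lines are handled; comment lines are skipped.
    """
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "oozie.use.system.libpath":
            return value.strip().lower() == "true"
    return False


# Hypothetical job.properties contents, for illustration only.
SAMPLE = """\
# example job.properties
nameNode=hdfs://namenode.example.com:8020
oozie.use.system.libpath=true
"""

if __name__ == "__main__":
    print(uses_system_libpath(SAMPLE))
```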