本文介绍了Sqoop自由格式查询在Hue / Oozie中导致无法识别的参数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图用自由格式查询运行sqoop命令,因为我需要执行聚合。它通过Hue界面提交,作为Oozie工作流程。以下是命令和查询的缩小版本。处理命令时,--query语句(用引号括起来)导致查询的每个部分被解释为无法识别的参数,如命令后面的错误所示。另外,目标目录被误解。什么阻止了它的运行,以及可以采取哪些措施来解决它? $ {env}和$ {shard}变量正在被正确解析,正如最后一条错误消息所反映的。

I am attempting to run a sqoop command with a free-form query, because I need to perform an aggregation. It's being submitted via the Hue interface, as an Oozie workflow. The following is a scaled-down version of the command and query. When the command is processed, the "--query" statement (enclosed in quotes) results in each portion of the query to be interpreted as unrecognized arguments, as shown in the error following the command. In addition, the target directory is being misinterpreted. What is preventing this from running, and what can be done to resolve it? The ${env} and ${shard} variables are being properly parsed, as reflected in the last error message.

谢谢!

===========

===========

import --connect jdbc:mysql:// irbasedw- $ {shard}。 db.xxxx.net:3417/irbasedw_${shard}?dontTrackOpenResources=true&defaultFetchSize=10000&useCursorFetch=true --username iretl --password-file /irdw/${env}/lib/.passwordBaseDw --table agg_daily_activity_performance_stage -m 1 --querySELECT SUM(click_count)FROM agg_daily_activity_performance_stage WHERE \ CONDITIONS GROUP BY 1--target-dir / irdw / $ {env} / legacy / agg / activity_performance / text / shard _ $ {shard}

import --connect jdbc:mysql://irbasedw-${shard}.db.xxxx.net:3417/irbasedw_${shard}?dontTrackOpenResources=true&defaultFetchSize=10000&useCursorFetch=true --username iretl --password-file /irdw/${env}/lib/.passwordBaseDw --table agg_daily_activity_performance_stage -m 1 --query "SELECT SUM(click_count) FROM agg_daily_activity_performance_stage WHERE \$CONDITIONS GROUP BY 1" --target-dir /irdw/${env}/legacy/agg/activity_performance/text/shard_${shard}

==========

==========


3881 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Error parsing arguments for import:
3881 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: SUM(click_count)
3881 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: FROM
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: agg_daily_activity_performance_stage
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: WHERE
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: \$CONDITIONS
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: GROUP
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: BY
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: 1"
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: --target-dir
3882 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: /irdw/test/legacy/agg/activity_performance/text/shard_0


推荐答案

我能够得到这个工作。解决方法是将所有查询元素作为单独的参数提交。什么都不应该在命令窗口中。相反,从import作为第一个参数开始,将查询的每个部分作为单独的参数输入。每个元素的属性和值都作为单独的参数输入。例如:

I was able to get this working. The solution is to submit all of the query elements as separate arguments. Nothing should be in the "Command" window. Instead, starting with "import" as the first argument, enter each part of the query as a separate argument. Properties and values for each element are entered as separate arguments. For example:


arg:  import
arg:  --connect
arg:  jdbc:mysql....
arg:  --username
arg:  [username]
arg:  --password-file
arg:  [password file]
arg:  --query
arg:  select .....
arg:  --target-dir
arg:  [target]

工作流程按预期执行。

这篇关于Sqoop自由格式查询在Hue / Oozie中导致无法识别的参数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-19 09:51