Problem description

I need to attach a custom execution hook in Apache Hive.
Please let me know if anyone knows how to do this.

The environment I am currently using:

Hadoop: Cloudera version 4.1.2
Operating system: CentOS

Thanks,
Arun

Solution

There are several types of hooks, depending on the stage at which you want to inject your custom code:

- Driver run hooks (pre/post)
- Semantic analyzer hooks (pre/post)
- Execution hooks (pre/failure/post)
- Client statistics publisher

If you run a script, the processing flow looks as follows:

1. Driver.run() takes the command.
2. HiveDriverRunHook.preDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
3. Driver.compile() starts processing the command: it creates the abstract syntax tree.
4. AbstractSemanticAnalyzerHook.preAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
5. Semantic analysis
6. AbstractSemanticAnalyzerHook.postAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
7. Create and validate the query plan (physical plan).
8. Driver.execute(): ready to run the jobs.
9. ExecuteWithHookContext.run() (HiveConf.ConfVars.PREEXECHOOKS)
10. ExecDriver.execute() runs all the jobs.
    - For each job, at every HiveConf.ConfVars.HIVECOUNTERSPULLINTERVAL interval: ClientStatsPublisher.run() is called to publish statistics (HiveConf.ConfVars.CLIENTSTATSPUBLISHERS).
    - If a task fails: ExecuteWithHookContext.run() (HiveConf.ConfVars.ONFAILUREHOOKS)
11. Finish all the tasks.
12. ExecuteWithHookContext.run() (HiveConf.ConfVars.POSTEXECHOOKS)
13. Before returning the result: HiveDriverRunHook.postDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
14. Return the result.

For each hook I have indicated the interface you have to implement. The parentheses give the HiveConf.ConfVars constant whose configuration property key you have to set at the beginning of the script in order to register your class.

E.g. setting the pre-execution hook (9th stage of the workflow), HiveConf.ConfVars.PREEXECHOOKS -> hive.exec.pre.hooks:

    set hive.exec.pre.hooks=com.example.MyPreHook;

Unfortunately these features aren't really documented, but you can always look into the Driver class to see the order in which the hooks are evaluated.

Remark: I assumed Hive 0.11.0 here; I don't think the Cloudera distribution differs (too much).
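To make the registration step more concrete, here is a minimal, self-contained sketch of what happens when a class name is placed in hive.exec.pre.hooks: the comma-separated names are split, each class is instantiated reflectively, and run() is called with a context object. So that it compiles without Hive on the classpath, the sketch declares its own simplified stand-ins for HookContext and ExecuteWithHookContext; these stand-ins, the getQueryString() accessor, the runHooks() helper and the HookDemo class are assumptions made for illustration, not Hive's actual code. In a real hook you would implement org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext instead.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Stand-alone sketch of how a pre-execution hook is registered and invoked.
 * HookContext and ExecuteWithHookContext below are simplified stand-ins
 * (assumptions) for the Hive types of the same name, so that this file
 * compiles and runs without Hive.
 */
final class HookDemo {

    /** Stand-in for Hive's HookContext; getQueryString() is a simplification. */
    static final class HookContext {
        private final String queryString;
        final List<String> log = new ArrayList<>();

        HookContext(String queryString) {
            this.queryString = queryString;
        }

        String getQueryString() {
            return queryString;
        }
    }

    /** Stand-in mirroring the single-method shape of the Hive interface. */
    interface ExecuteWithHookContext {
        void run(HookContext hookContext) throws Exception;
    }

    /** Hypothetical hook: records the query it was called for. */
    public static final class MyPreHook implements ExecuteWithHookContext {
        @Override
        public void run(HookContext hookContext) {
            hookContext.log.add("pre-exec: " + hookContext.getQueryString());
        }
    }

    /**
     * Imitates what a driver does with a hook property: split the
     * comma-separated class names, instantiate each class reflectively,
     * and call run() in the order the names were listed.
     */
    static List<String> runHooks(String hookClassNames, String query) throws Exception {
        HookContext context = new HookContext(query);
        for (String name : hookClassNames.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas in the property value
            }
            Object hook = Class.forName(trimmed).getDeclaredConstructor().newInstance();
            ((ExecuteWithHookContext) hook).run(context);
        }
        return context.log;
    }

    public static void main(String[] args) throws Exception {
        // Equivalent in spirit to: set hive.exec.pre.hooks=<class name>;
        String property = MyPreHook.class.getName();
        for (String line : runHooks(property, "SELECT 1")) {
            System.out.println(line);
        }
    }
}
```

Running main prints "pre-exec: SELECT 1". With real Hive, the compiled hook class has to be on Hive's classpath, and the post-execution and failure hooks are registered the same way through hive.exec.post.hooks and hive.exec.failure.hooks.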