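Original source of this article: http://blog.csdn.net/bluishglc/article/details/47021019 Reproduction in any form is strictly prohibited; violations will be referred to CSDN to protect the author's rights.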

First, let’s make the topic clear:

Compared with providing raw Oozie workflow/coordinator XML files, what are the disadvantages of creating workflows/coordinators with the Hue Oozie Editor? (The Hue Oozie Editor version discussed in this article is the one in HDP 2.2.4.)

Without a deep understanding of the Hue Oozie Editor, everybody likes it at first glance, and why not? It's easy to use, what you see is what you get, and who wants to write ugly XML files by hand?

But the truth is that the Hue Oozie Editor is not as good as it looks; it is far from being a stable and powerful tool for creating and managing workflows.

Here are the problems:

  1. As the core source code of the workflows, the raw XML files should be kept under version control. The Hue Oozie Editor has little or no version control capability.

  2. If we maintain the raw XML files in a project, we can use build tools to configure environment-specific parameters, e.g. the namenode and the input/output data locations, and then easily build the project for the dev, test, or production environment.

By contrast, what if we use the Hue Oozie Editor? Congratulations! You get to do the same job twice: re-create the workflows/coordinators manually on the production cluster. There is an import/export feature in the Hue Oozie Editor, but it only covers workflows, not coordinators, and even for workflows you still have to change all environment-specific parameters manually.

  3. It doesn't support some advanced features, so we have to edit the raw XML file anyway. For example, you can't use an expression such as ${coord:current(-1)} between input-events and a dataset; you can only map them directly.

  4. It can't import/export coordinators.
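The point about environment-specific parameters in the list above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the environment names, hosts, ports, the dataRoot property, and the application path are made-up placeholders, and a real project would more likely rely on its build tool's own resource filtering.

```python
# Hypothetical per-environment settings; every host, port, and path is a placeholder.
ENVIRONMENTS = {
    "dev": {
        "nameNode": "hdfs://dev-nn.example.com:8020",
        "jobTracker": "dev-rm.example.com:8050",
        "dataRoot": "/data/dev",
    },
    "prod": {
        "nameNode": "hdfs://prod-nn.example.com:8020",
        "jobTracker": "prod-rm.example.com:8050",
        "dataRoot": "/data/prod",
    },
}


def render_job_properties(env: str) -> str:
    """Render the text of a job.properties file for one environment."""
    settings = ENVIRONMENTS[env]
    lines = [f"{key}={value}" for key, value in sorted(settings.items())]
    # Not an f-string on purpose: ${nameNode} is left for Oozie to resolve.
    lines.append("oozie.wf.application.path=${nameNode}/apps/example-workflow")
    return "\n".join(lines) + "\n"
```

Calling render_job_properties("dev") and render_job_properties("prod") yields two property files from the same versioned workflow XML, which is the part that has to be redone by hand in the editor.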

Well, can we at least import our raw workflow files into the Hue Oozie Editor?

Let's look at how weak the current Hue Oozie Editor is:

  1. For schema versions, the Hue Oozie Editor only supports workflow schemas up to 0.4 and hive-action schemas up to 0.2; with anything higher you can't import your raw file.

  2. It's hard to believe, but the property names jobTracker and nameNode are hard-coded! If you don't use these two property names, again, you can't import your raw file.

  3. Some parameters accept embedded variables, e.g. ${nameNode}/data/${year}/${month}, but some don't. Which ones do and which don't? You have to try them one by one; otherwise you still can't import your raw file.
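The first two restrictions can be checked before attempting an import. The following Python sketch is one possible pre-check, not something Hue provides. It assumes that "hard-coded property names" means the job-tracker and name-node elements must contain exactly ${jobTracker} and ${nameNode}; the function name and the messages are invented for illustration. The third restriction is not covered, since the source gives no rule for it.

```python
import re
import xml.etree.ElementTree as ET

# Limits taken from the list above (Hue Oozie Editor in HDP 2.2.4).
MAX_WORKFLOW_SCHEMA = (0, 4)
MAX_HIVE_ACTION_SCHEMA = (0, 2)

# Matches namespaces such as {uri:oozie:workflow:0.4} in ElementTree tags.
_NS_PATTERN = re.compile(r"^\{uri:oozie:([a-z-]+):(\d+(?:\.\d+)*)\}")


def _schema(tag):
    """Return (schema name, version tuple) for a namespaced tag, else None."""
    match = _NS_PATTERN.match(tag)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group(2).split("."))
    return match.group(1), version


def check_importable(workflow_xml):
    """Return a list of reasons why the editor would probably reject the file."""
    problems = []
    for element in ET.fromstring(workflow_xml).iter():
        parsed = _schema(element.tag)
        if parsed is None:
            continue
        name, version = parsed
        local = element.tag.split("}", 1)[1]
        text = (element.text or "").strip()
        label = ".".join(str(part) for part in version)
        if local == "workflow-app" and name == "workflow" and version > MAX_WORKFLOW_SCHEMA:
            problems.append(f"workflow schema {label} is higher than 0.4")
        if local == "hive" and name == "hive-action" and version > MAX_HIVE_ACTION_SCHEMA:
            problems.append(f"hive-action schema {label} is higher than 0.2")
        # Assumption: the editor expects exactly these two variable names.
        if local == "job-tracker" and text != "${jobTracker}":
            problems.append(f"job-tracker uses {text}, not ${{jobTracker}}")
        if local == "name-node" and text != "${nameNode}":
            problems.append(f"name-node uses {text}, not ${{nameNode}}")
    return problems
```

An empty list means none of the two checked restrictions was hit; it does not guarantee that the import succeeds.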

Nobody hates UI design tools, but they have to be good enough. For now, I would say that building workflows on top of the Hue Oozie Editor is unwise.

Obviously, we should choose raw XML files over the Hue Oozie Editor.

There is only one small problem: Hue can only start a workflow/coordinator that was edited in the Hue Oozie Editor. Note: once a workflow/coordinator has been started, you can monitor and stop it from Hue even if it is described by raw XML.

I don't think this is really a problem, though: we can start a workflow/coordinator from the command line. Remember that a workflow/coordinator is normally a long-running background service that we rarely start or stop, so the command line is enough for operation and maintenance.
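For reference, the command in question is the standard Oozie client invocation oozie job -oozie URL -config FILE -run. The sketch below wraps it in Python so that it can be called from a deployment script; the function names, the server URL, and the file name are placeholders chosen for illustration, and the oozie client is assumed to be on the PATH.

```python
import subprocess


def oozie_run_command(oozie_url, properties_file):
    """Build the argument list that submits and starts a job in one step."""
    return ["oozie", "job", "-oozie", oozie_url, "-config", properties_file, "-run"]


def start_job(oozie_url, properties_file):
    """Run the Oozie client and return whatever it prints (it reports the job ID)."""
    result = subprocess.run(
        oozie_run_command(oozie_url, properties_file),
        check=True,           # raise CalledProcessError if the client fails
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


# Example call (placeholder URL and file name):
# start_job("http://oozie.example.com:11000/oozie", "job.properties")
```

Whether the job is a workflow or a coordinator is decided by the properties file, through oozie.wf.application.path or oozie.coord.application.path.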

Besides the command line, you can also start a workflow/coordinator remotely via the Oozie REST API.
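As a rough sketch of the REST route, the Oozie Web Services API accepts a POST to /v1/jobs with an XML configuration as the body, and the action=start query parameter starts the job immediately. The Python below only uses the standard library; the helper names, host, user, and application path are assumptions for illustration, and security settings such as Kerberos are not handled.

```python
import json
import urllib.request
from xml.sax.saxutils import escape


def build_configuration(properties):
    """Serialize job properties into the XML configuration body Oozie expects."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?>', "<configuration>"]
    for name, value in properties.items():
        parts.append(
            f"  <property><name>{escape(name)}</name><value>{escape(value)}</value></property>"
        )
    parts.append("</configuration>")
    return "\n".join(parts).encode("utf-8")


def build_start_request(oozie_url, properties):
    """Prepare a POST request that submits a job and starts it right away."""
    return urllib.request.Request(
        url=f"{oozie_url.rstrip('/')}/v1/jobs?action=start",
        data=build_configuration(properties),
        headers={"Content-Type": "application/xml;charset=UTF-8"},
        method="POST",
    )


def submit(request):
    """Send the request and return the job ID from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["id"]


# Example (placeholder host, user, and path):
# request = build_start_request(
#     "http://oozie.example.com:11000/oozie",
#     {
#         "user.name": "etl",
#         "oozie.wf.application.path": "hdfs://nn.example.com:8020/apps/example-workflow",
#     },
# )
# job_id = submit(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a production cluster.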
