Big Data - Apache Oozie

Lesson Description

Lesson - #750 Working of Apache Oozie

Oozie runs as a service in the cluster. Clients submit workflow definitions to it for processing. A workflow is built from two types of nodes: control-flow nodes and action nodes.

An action node represents a workflow task, such as running a MapReduce job, importing data, or running a shell script.

A control-flow node controls how execution moves between actions, using constructs such as conditional logic. The control-flow nodes include a start node (where a workflow job begins), an end node (which marks the end of a job), and an error node (which the workflow moves to if an action fails; Oozie workflow definitions call this a kill node).
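The node types described above can be seen in a small workflow definition. The following is a minimal sketch, not a production workflow: the workflow name, the node names (run-script, fail) and the shell command are made up for illustration, and the XML is trimmed and not validated against the Oozie schema. The Python code only parses the definition and separates control-flow nodes from action nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow definition: one shell action between start and end,
# with a kill node that is reached if the action reports an error.
WORKFLOW_XML = """
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="run-script"/>
  <action name="run-script">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>echo</exec>
      <argument>hello</argument>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Script failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
"""

# Element names that Oozie treats as control-flow nodes.
CONTROL_TAGS = {"start", "end", "kill", "decision", "fork", "join"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def classify_nodes(xml_text):
    """Return (control_nodes, action_names) for the top-level workflow nodes."""
    root = ET.fromstring(xml_text)
    control, actions = [], []
    for child in root:
        tag = local_name(child.tag)
        if tag in CONTROL_TAGS:
            # The start node has no name attribute, so fall back to its tag.
            control.append((tag, child.get("name", tag)))
        elif tag == "action":
            actions.append(child.get("name"))
    return control, actions


control_nodes, action_names = classify_nodes(WORKFLOW_XML)
print("control-flow nodes:", control_nodes)
print("action nodes:", action_names)
```

Each action names the node to go to on success (ok) and on failure (error), which is how the control-flow nodes tie the actions together.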

When a workflow finishes, Oozie can use an HTTP callback to notify the client of the workflow's status.
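On the client side, receiving such a callback amounts to handling an HTTP request and reading the job ID and status from it. The sketch below assumes the client has configured a notification URL whose query string carries the values as jobId and status; the path /oozie-callback, the parameter names, and the sample job ID are assumptions for illustration, not fixed by Oozie.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlsplit


def parse_callback(path):
    """Extract (job_id, status) from a callback request path.

    Assumes query parameters named 'jobId' and 'status' (hypothetical names
    chosen by whoever configures the notification URL).
    """
    query = parse_qs(urlsplit(path).query)
    job_id = query.get("jobId", [None])[0]
    status = query.get("status", [None])[0]
    return job_id, status


class CallbackHandler(BaseHTTPRequestHandler):
    """Minimal handler that logs the workflow status and replies 200."""

    def do_GET(self):
        job_id, status = parse_callback(self.path)
        print(f"workflow {job_id} is now {status}")
        self.send_response(200)
        self.end_headers()


# Example of parsing a callback path without starting a server.
sample = "/oozie-callback?jobId=0000001-200101000000000-oozie-oozi-W&status=SUCCEEDED"
print(parse_callback(sample))
```

To actually listen for callbacks, the handler could be served with http.server.HTTPServer(("", 8000), CallbackHandler).serve_forever(), where the port is again an arbitrary choice.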