Assuming that Oozie has been installed and configured as mentioned here, and that a simple workflow can be executed as mentioned here, it's time to look at how to schedule the workflow at a regular interval using Oozie.
- Create the coordinator.properties file in HDFS (oozie-clickstream-examples/apps/scheduler/coordinator.properties)
    nameNode=hdfs://localhost:9000
    jobTracker=localhost:9001
    queueName=default
    examplesRoot=oozie-clickstream-examples
    examplesRootDir=/user/${user.name}/${examplesRoot}
    oozie.use.system.libpath=true
    oozie.coord.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/scheduler

- Create the coordinator.xml file in HDFS (oozie-clickstream-examples/apps/scheduler/coordinator.xml). The job runs every 10 minutes between the specified start and end times.
    <coordinator-app name="wf_scheduler" frequency="10" start="2013-10-24T22:08Z" end="2013-10-24T22:12Z" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
        <action>
            <workflow>
                <app-path>${nameNode}${examplesRootDir}/apps/cs</app-path>
            </workflow>
        </action>
    </coordinator-app>

Note that the Oozie coordinator can be time based or event based, with much more complexity than shown here. Here are the specifications for the Oozie coordinator.
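To get a feel for how `start`, `end` and `frequency` interact, here is a minimal Python sketch that lists the times at which a run would be expected. It assumes that Oozie creates the first action at `start` and then one every `frequency` minutes, stopping before `end` (treating `end` as exclusive is an assumption worth checking against the coordinator specification). The helper name `nominal_times` is made up for this illustration and is not part of Oozie.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout used in coordinator.xml, e.g. 2013-10-24T22:08Z
OOZIE_FMT = "%Y-%m-%dT%H:%MZ"


def nominal_times(start, end, frequency_minutes):
    """List the expected run times: start, start + f, ... strictly before end.

    Hypothetical helper; the exclusive end is an assumption about Oozie.
    """
    if frequency_minutes <= 0:
        raise ValueError("frequency_minutes must be positive")
    current = datetime.strptime(start, OOZIE_FMT).replace(tzinfo=timezone.utc)
    stop = datetime.strptime(end, OOZIE_FMT).replace(tzinfo=timezone.utc)
    step = timedelta(minutes=frequency_minutes)
    times = []
    while current < stop:
        times.append(current.strftime(OOZIE_FMT))
        current += step
    return times


if __name__ == "__main__":
    # The window from coordinator.xml above
    print(nominal_times("2013-10-24T22:08Z", "2013-10-24T22:12Z", 10))
    # A one hour window, for comparison
    print(nominal_times("2013-10-24T22:08Z", "2013-10-24T23:08Z", 10))
```

Under these assumptions, the 4 minute window used above is shorter than the 10 minute frequency, so it would yield a single run at 22:08; widening the window to an hour would yield six.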
- For some reason, Oozie picks up a time 1 hour behind the system time. This can be observed by comparing the `Created` time of the latest submitted workflow with the system time at the top right of the screen below. So, the start and end times in the previous step had to be moved back by an hour from the actual times.
Not sure why this happens, but I will update the blog if the cause is found or someone posts it in the comments.
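One possible explanation, in line with the `timezone="UTC"` attribute and the trailing `Z` in coordinator.xml, is that the coordinator times are read as UTC while the system clock shows local time. Whether that accounts for the offset seen here is not confirmed. If it does, the start and end values can be derived from local time instead of being adjusted by hand. A minimal Python sketch, where the helper `to_oozie_utc` and the +01:00 offset are made up for illustration:

```python
from datetime import datetime, timedelta, timezone


def to_oozie_utc(local_time):
    """Format an offset-aware datetime as a UTC timestamp like 2013-10-24T22:08Z.

    Hypothetical helper, not part of Oozie.
    """
    if local_time.tzinfo is None:
        raise ValueError("local_time must carry a UTC offset")
    return local_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")


if __name__ == "__main__":
    # Assumed local offset of +01:00; replace with the machine's real offset.
    local_zone = timezone(timedelta(hours=1))
    start = datetime(2013, 10, 24, 23, 8, tzinfo=local_zone)
    print(to_oozie_utc(start))  # 2013-10-24T22:08Z
```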
- Submit the Oozie coordinator job as

    bin/oozie job -oozie http://localhost:11000/oozie -config /home/vm4learning/Code/oozie-clickstream-examples/apps/scheduler/coordinator.properties -run

- The coordinator job should appear in the Oozie console, moving from the PREP to the RUNNING state and all the way to the SUCCEEDED state.
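The state changes can also be followed from a script instead of the console. The Python sketch below polls the job status until it reaches a final state. It assumes the Oozie web services endpoint `/v1/job/<job-id>?show=info` is reachable at the same URL used in the submit command and returns JSON with a `status` field. The function names, the polling interval and the set of final states are choices made for this illustration.

```python
import json
import time
from urllib.request import urlopen

# States treated as final in this sketch (an assumption, adjust as needed)
TERMINAL = {"SUCCEEDED", "KILLED", "FAILED", "DONEWITHERROR"}


def fetch_status(job_id, oozie_url="http://localhost:11000/oozie"):
    """Ask the Oozie REST API for the current status of a job."""
    with urlopen(f"{oozie_url}/v1/job/{job_id}?show=info") as response:
        return json.load(response)["status"]


def wait_for_job(job_id, fetch=fetch_status, poll_seconds=30, max_polls=120):
    """Poll until the job reaches a final state; return the distinct states seen."""
    seen = []
    for _ in range(max_polls):
        status = fetch(job_id)
        if not seen or seen[-1] != status:
            seen.append(status)  # record each state change once
        if status in TERMINAL:
            return seen
        time.sleep(poll_seconds)
    raise TimeoutError(f"{job_id} did not finish after {max_polls} polls")
```

Calling `wait_for_job(job_id)` with the job id printed by the `-run` command would be expected to return something like the PREP, RUNNING, SUCCEEDED sequence described above. The `fetch` parameter exists only so that the polling loop can be tried out without a running Oozie server.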
- The output should appear as below in the `oozie-clickstream-examples/finaloutput/000000_0` file in HDFS.
    www.businessweek.com 2
    www.eenadu.net 2
    www.stackoverflow.com 2

Note that Oozie has a concept of bundles, where a user can batch a group of coordinator applications and execute an operation on the whole bunch at once. I will look into it in another blog entry.
Nice post!
You can also now enjoy a better Oozie UI with Hue ;) (e.g. http://gethue.tumblr.com/tagged/oozie)
Planned to install Hue, but the documentation mentions CDH as a prerequisite. Not sure if it works with frameworks from Apache. Looks like the same is the case with Impala also.
The cause of the delay may be that it needs the times in UTC and UTC only. I have seen this referred to in the Hue documentation; maybe it is Oozie's limitation.
Hello Sir, I am having a problem with Oozie job run time. Could you help me?
I have asked the question on Stack Overflow as well.
Please, I need your help.
http://stackoverflow.com/questions/43465972/org-apache-hadoop-security-authorize-authorizationexception-user-developer4-is
Hi,
I am really frustrated with Oozie logs. They don't give proper error information. Whatever mistake I make, it always throws the following message:
Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
I enabled the "Trace" log level in the Cloudera admin page, but still found no useful information.
Can you please tell me if there is any other way to debug the Oozie job?