- The first thing I did was download the WordCount program source code by executing
git clone https://github.com/sdpatil/HadoopWordCount3
This program has a Maven script for building an executable jar, so I used the
mvn clean package
command to build the Hadoop jar.
- After that I tried executing the program manually by using the following command
hadoop jar target/HadoopWordCount.jar sorttest.txt output/wordcount
- Now, in order to use an Oozie workflow, you will have to create a particular folder structure on your machine
wordcount
-- job.properties
-- workflow.xml
-- lib
---- HadoopWordCount.jar
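The folder structure above could be created with a few shell commands. This is only a sketch: it assumes you run it from the project root, that the jar was already built at target/HadoopWordCount.jar, and it creates the two configuration files as empty placeholders to be filled in later.

```shell
# Create the workflow folder and its lib subfolder.
mkdir -p wordcount/lib

# Empty placeholders for the two configuration files.
touch wordcount/job.properties wordcount/workflow.xml

# Copy the jar built by Maven into lib, if it exists (path is an assumption).
if [ -f target/HadoopWordCount.jar ]; then
  cp target/HadoopWordCount.jar wordcount/lib/
fi

# Show the resulting layout.
find wordcount | sort
```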
- In the wordcount folder create a job.properties file. This file lets you pass parameters to your Oozie workflow. The values of nameNode and jobTracker represent the name node and job tracker locations. In my case I am using the Cloudera VM with a single node, so both of these properties point to localhost. The value of oozie.wf.application.path is equal to the HDFS path where you uploaded the wordcount folder created in step 3
- Next define your Apache Oozie workflow.xml file. In my case the workflow has a single step, which is to execute the MapReduce job. I am setting the following properties
  - mapred.mapper.new-api & mapred.reducer.new-api: Set these properties to true if you are using the new MapReduce API based on the org.apache.hadoop.mapreduce.* classes
  - mapreduce.map.class: The fully qualified name of your mapper class
  - mapreduce.reduce.class: The fully qualified name of your reducer class
  - mapred.output.key.class: Fully qualified name of the output key class. This is the same as the parameter to job.setOutputKeyClass() in your driver class
  - mapred.output.value.class: Fully qualified name of the output value class. This is the same as the parameter to job.setOutputValueClass() in your driver class
  - mapred.input.dir: Location of your input file. In my case I have sorttest.txt in the hdfs://localhost/user/cloudera directory
  - mapred.output.dir: Location of the output that will get generated. In my case I want the output to go to the hdfs://localhost/user/cloudera/output/wordcount directory
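The two files described above might look roughly like the following. This is a sketch, not the original files: the ports 8020 and 8021, the queue name, the HDFS application path /user/cloudera/wordcount, the workflow and action names, and the com.example.* mapper and reducer class names are all assumptions for illustration, and Text / IntWritable are only the usual WordCount output types. Replace them with the values from your own cluster and driver class.

```shell
mkdir -p wordcount

# job.properties: parameters passed to the workflow.
# Ports and the application path are assumptions, not values from the original post.
cat > wordcount/job.properties <<'EOF'
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
oozie.wf.application.path=${nameNode}/user/cloudera/wordcount
EOF

# workflow.xml: a single map-reduce action using the properties listed above.
# The com.example.* class names are hypothetical placeholders.
cat > wordcount/workflow.xml <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.2" name="wordcount-wf">
  <start to="wordcount"/>
  <action name="wordcount">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property><name>mapred.job.queue.name</name><value>${queueName}</value></property>
        <property><name>mapred.mapper.new-api</name><value>true</value></property>
        <property><name>mapred.reducer.new-api</name><value>true</value></property>
        <property><name>mapreduce.map.class</name><value>com.example.WordCountMapper</value></property>
        <property><name>mapreduce.reduce.class</name><value>com.example.WordCountReducer</value></property>
        <property><name>mapred.output.key.class</name><value>org.apache.hadoop.io.Text</value></property>
        <property><name>mapred.output.value.class</name><value>org.apache.hadoop.io.IntWritable</value></property>
        <property><name>mapred.input.dir</name><value>${nameNode}/user/cloudera/sorttest.txt</value></property>
        <property><name>mapred.output.dir</name><value>${nameNode}/user/cloudera/output/wordcount</value></property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>WordCount failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF
```

The heredoc delimiter is quoted ('EOF') so that the shell leaves expressions such as ${nameNode} untouched; Oozie resolves them when the job is submitted.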
- Once your Oozie workflow is ready, upload the wordcount folder to HDFS by executing the following command
hdfs dfs -put oozie wordcount
- Now run your Oozie workflow by executing the following command from your wordcount directory
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
If it runs successfully you should see the output generated in the
hdfs://localhost/user/cloudera/output/wordcount
directory
Using Apache Oozie to execute MapReduce jobs
I wanted to learn how to automate a MapReduce job using Oozie, so I decided to create an Oozie workflow to invoke the WordCount (HelloWorld) MapReduce program. I had to follow these steps