Creating custom Partitioner class for your mapreduce program

The MapReduce framwork uses instance of org.apache.hadoop.mapreduce.Partitioner class to figure our which mapreduce output key goes to which reducer. By default it uses org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, this class calculates hash value for the key and divides it by number of Reducers in the program and uses remainder to figure out the reducer it goes to. This implementation is pretty good and as long as the keys generate hashCodes that gives uniform distribution it should be good. But in some exceptional cases you might want to take control of how the output of Mapper gets distributed to Reducers. I just wanted to figure out how this works, so i decided to change my WordCount(HelloWorld) MapReduce program to add a custom Partitioner that sends upper and lower case alphabets two 2 different reducers. I followed these steps
  1. First i did create a class like this First thing i am doing is checking if there are 2 reducers if yes i am using the first letter of the key to figure out if it starts with lower case letter (simply check it against 'a' letter, if yes send it to first reducer if not send it to second reducer
  2. I had to make few changes in the Driver program to use my WordCountPartitioner
    • job.setNumReduceTasks(2): This call is asking MapReduce framework to use 2 reducers
    • job.setPartitionerClass(WordCountPartitioner.class); This call is setting my WordCountPartitioner as the class for partitioner
This screen shot shows how my sample.txt got divided into 2 reducer outputs. First 2 lines show output with default HashPartitioner and 2nd 2 lines show output when i used my custom Partitioner


Prashant said...

Nice Article Sunil :)

Yogesh Choudhary said...

Good Article !!
Nicely explained.

sai said...

well explained :) thank you bro :)

jack AKA karthik said...

Keep going

Vaishali Parmar said...
This comment has been removed by the author.
Vaishali Parmar said...

Really helpful...Thanks

Geetha Devi said...

I just want to know about Map Reduce program and found this post this post is perfect one ,Thanks for sharing the informative post of Map Reduce and able to understand the concepts easily,Thoroughly enjoyed reading
Check out the

sowmiya gopal said...

hii that post on hadoop mapreduce was really good and informative do posting your blog as i regularly read your blog to clear my doubts Hadoop Training in Velachery | Hadoop Training .
Hadoop Training in Chennai | Hadoop .

Vikas Chaudhary said...

Battery Mantra is Authorized exide car battery dealer in Noida and Greater Noida. We are providing our service in Indirapuram, Delhi, Ashok Nagar.

Exide Battery Dealer in Noida
Battery Dealer in Noida
Authorized Battery Dealer in Noida
Car Battery Dealer in Noida
Car Battery Dealer
Exide Battery Dealer

EG MEDI said... is online medical store pharmacy in laxmi nagar Delhi. You can Order prescription/OTC medicines online. Cash on Delivery available. Free Home Delivery

Online Pharmacy in Delhi
Buy Online medicine in Delhi
Online Pharmacy in laxmi nagar
Buy Online medicine in laxmi nagar
Onine Medical Store in Delhi
Online Medical store in laxmi nagar
Online medicine store in delhi
online medicine store in laxmi nagar
Purchase Medicine Online
Online Pharmacy India
Online Medical Store