- First, I created a WordCountCombiner.java class that looks the same as WordCountReducer, except that I added one System.out.println() call to it so that I would know when my combiner is called instead of the reducer.
- Then I changed the driver class of my MapReduce program to add the line job.setCombinerClass(WordCountCombiner.class);
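These two steps might look roughly like the sketch below. It is not the original code from this post: it assumes the standard Hadoop org.apache.hadoop.mapreduce API and assumes that WordCountReducer simply sums IntWritable counts per word. The class name WordCountMapper in the trailing comment is a placeholder.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Assumed shape of the combiner: the same summing logic as a typical
// WordCountReducer, plus one println to show when the combiner runs.
public class WordCountCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // This message ends up in the task logs of the task running the combiner.
        System.out.println("Combiner called for key: " + key);

        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        // Output types match the map output types (Text, IntWritable),
        // so the reducer can consume combined and uncombined records alike.
        context.write(key, new IntWritable(sum));
    }
}

// In the driver (WordCountMapper is a placeholder name):
//   job.setMapperClass(WordCountMapper.class);
//   job.setCombinerClass(WordCountCombiner.class);
//   job.setReducerClass(WordCountReducer.class);
```

Since Hadoop treats the combiner as an optimization and may run it zero, one, or several times for a given key, the job should produce the same result with or without it. Summing counts satisfies this.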
How to create custom Combiner class with your MapReduce framework
The MapReduce framework passes control to your combiner class on the map side, before the map output is sent to the Reducers, so that the combiner can combine (reduce) the records generated by each Mapper locally. Sending data from a Mapper to a Reducer requires that data to go over the network, so combining it first cuts down the amount of data that has to be transferred.
I wanted to try creating a custom combiner class. To keep things simple, I decided to add a combiner class to the WordCount (HelloWorld) MapReduce program. Basically, my combiner class does the same thing as the reducer: it takes multiple [word, 1] tuples and combines them into something like [word1, 5], [word2, 6], etc. I followed the two steps listed in this post.
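To illustrate what combining [word, 1] tuples means, here is a small plain-Java simulation. It does not use Hadoop at all, and the class and method names (CombinerSketch, map, combine) are made up for this example. It only shows how local summing reduces the number of records a mapper would have to send over the network.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Hadoop classes are involved.
class CombinerSketch {

    // Map step: emit one [word, 1] tuple per word in the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
            }
        }
        return out;
    }

    // Combine step: sum the counts per word, keeping first-seen order.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> tuples) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> tuple : tuples) {
            sums.merge(tuple.getKey(), tuple.getValue(), Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = map("to be or not to be");
        Map<String, Integer> combined = combine(mapped);
        System.out.println("tuples from mapper:   " + mapped.size());   // 6
        System.out.println("tuples after combine: " + combined.size()); // 4
        System.out.println(combined); // {to=2, be=2, or=1, not=1}
    }
}
```

For this one input line, the mapper emits six tuples but only four would leave the node after combining. On real input with many repeated words the saving is much larger.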