- First, I created a simple Mapper that receives the content of the text file one line at a time. The Mapper splits each line into words and writes every word to the output with a frequency count of 1 by calling context.write(word, one). In this case the word becomes the key and the count becomes the value.
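The map step can be illustrated without Hadoop. The following is a minimal plain-Java sketch, not the Mapper from the downloadable source: the class MapStepSketch and the method mapLine are hypothetical names, and the returned list stands in for the calls to context.write(word, one).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-in for the Mapper: no Hadoop classes are used here.
class MapStepSketch {

    // Splits one line into whitespace-separated words and emits a (word, 1) pair for each.
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            // Plays the role of context.write(word, one) in a real Mapper.
            emitted.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Prints [aaa=1, bbb=1, ccc=1, aaa=1]
        System.out.println(mapLine("aaa bbb ccc aaa"));
    }
}
```

Note that the map step does no counting of its own: a word that appears twice is simply emitted twice, each time with the value 1.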
- Next, I developed a Reducer class, which receives a word as the key and a list of all the counts for that word as the value. For example, if your input file is simple text like aaa bbb ccc aaa, the reduce method is called with aaa - [1, 1], bbb - [1], and ccc - [1] as input. The Hadoop framework takes care of collecting the output of the Mapper and converting it into key - [value, value] format. In the Reducer, the only thing I had to do was iterate through all the values and add them up. Once I have the sum, I write it as the output of the Reducer by calling context.write(key, new IntWritable(sum));
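The grouping and summing described here can also be simulated in plain Java. This is only a sketch under assumptions: ReduceStepSketch, groupByKey, and reduce are hypothetical names, a TreeMap stands in for the framework's sort-and-group step, and the return value of reduce stands in for context.write(key, new IntWritable(sum)).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the shuffle and reduce phases: no Hadoop classes are used here.
class ReduceStepSketch {

    // Mimics what the framework does between map and reduce: every emitted word
    // contributes one count of 1 to its key, giving key -> [1, 1, ...] sorted by key.
    static Map<String, List<Integer>> groupByKey(List<String> emittedWords) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : emittedWords) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Mimics the Reducer body: iterate over the values for one key and add them up.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped =
                groupByKey(Arrays.asList("aaa", "bbb", "ccc", "aaa"));
        System.out.println(grouped); // {aaa=[1, 1], bbb=[1], ccc=[1]}
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            // A real Reducer would call context.write(key, new IntWritable(sum)) here.
            System.out.println(entry.getKey() + "\t" + reduce(entry.getKey(), entry.getValue()));
        }
    }
}
```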
- The last part is creating WordCountDriver.java, a Java program that sets up the Hadoop job by configuring the input, defining the output, and specifying the Mapper and Reducer classes. After initializing the job, it calls job.waitForCompletion(true), which submits the job to the Hadoop framework and waits for it to complete.
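Put together, a job of this shape might look like the sketch below. It is not the downloadable source: it assumes the org.apache.hadoop.mapreduce API from Hadoop 2 or later is on the classpath, it nests the Mapper and Reducer inside the driver only to keep the example in one file, and the nested class names WordCountMapper and WordCountReducer and the job name are placeholders.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {

    // Placeholder Mapper: emits (word, 1) for every token in the input line.
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, one);
            }
        }
    }

    // Placeholder Reducer: sums the list of counts received for each word.
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: WordCountDriver <input path> <output path>");
            System.exit(2);
        }
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count"); // job name is arbitrary
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Hands control to the framework and blocks until the job finishes.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The two arguments are the input path and the output path. The output directory must not already exist, otherwise Hadoop refuses to start the job.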
- Now you can either use an existing .txt file on your machine or create a text file like this:
hhh eee iii bbb ccc fff ddd ggg aaa aaa XXX YYY ZZZ hhh eee iii bbb ccc fff ddd ggg aaa aaa XXX YYY ZZZ hhh eee iii bbb ccc fff ddd ggg aaa aaa XXX YYY ZZZ hhh eee iii bbb ccc fff ddd ggg aaa aaa hhh eee iii bbb ccc fff ddd ggg aaa aaa
- The last step is to run your Hadoop program. If you used Eclipse or another IDE to develop your code, you can run WordCountDriver.java directly. The program takes 2 parameters: the input path and the output path. In my case, the input file is on the local file system and I want the output stored on the local file system too, so I pass the following 2 parameters:
file:///Users/sunil/hadoop/sorttest.txt file:///Users/sunil/hadoop/output/wordcount
- Once the program finishes successfully, you should see a part-r-00000 file on your local machine in /Users/sunil/hadoop/output/wordcount. If you open it, you should see output like this:
XXX 3
YYY 3
ZZZ 3
aaa 10
bbb 5
ccc 5
ddd 5
eee 5
fff 5
ggg 5
hhh 5
iii 5
WordCount(HelloWorld) MapReduce program
I am learning about MapReduce, and in order to experiment with it I created this simple program, which takes a text file as input and generates output showing how frequently each word appears in the file. You can download the source code for the program from here.