MapReduce program that reads input files from S3 and writes output to S3

In the WordCount(HelloWorld) MapReduce program entry i talked about how to create a simple WordCount Map Reducer program with Hadoop. I wanted to change it to so that it reads input files from Amazon S3 bucket and writes output back to Amazon S3 bucket, so i built S3MapReduce program, that you can download from here. I followed these steps
  1. First create 2 buckets one for storing input and other for storing output in your Amazon S3 account. Most important issue here is to make sure that you create your buckets in US Standard region, if you dont do that then additional steps might be required for Hadoop to be able to access your buckets Name of input bucket in my case is com.spnotes.hadoop.wordcount.books
    Name of the output bucket is com.spnotes.hadoop.wordcount.output
  2. Upload few .txt files that you want to use as input in your input bucket like this
  3. Next step is to create MapReduce program like this, In my case one Java class has code for Mapper, Reducer and driver class. Most of the code in the MapReduce is same only difference is for working with S3 you will have to add few S3 specific properties like this, basically you need to set your accessKey and secretAccessKey that you can get from AWS Security console and paste it here. You will also have to tell Hadoop to use s3n as file system.
    //Replace this value
    job.getConfiguration().set("fs.s3n.awsAccessKeyId", "awsaccesskey");
    //Replace this value
  4. Now last step is to execute this program, it takes 2 inputs, You can just right click on your S3MapReduce program and say execute with following 2 parameters
    s3n://com.spnotes.hadoop.wordcount.books s3n://com.spnotes.hadoop.wordcount.output/output3
  5. Once the MapReduce is executed you can check the output by going to S3 console and looking at content of com.spnotes.hadoop.wordcount.output like this


Prakash Raju Meka said...

How to configure in case of having two different keys for two(READ and WRITE) buckets ?

srjwebsolutions said...

We are leading responsive website designing and development company in Noida.
We are offering mobile friendly responsive website designing, website development, e-commerce website, seo service and sem services in Noida.

Responsive Website Designing Company in Noida
Website Designing Company in Noida
SEO Services in Noida
SMO Services in Noida

Vikas Chaudhary said...

Battery Mantra is Authorized exide car battery dealer in Noida and Greater Noida. We are providing our service in Indirapuram, Delhi, Ashok Nagar.

Exide Battery Dealer in Noida
Battery Dealer in Noida
Authorized Battery Dealer in Noida
Car Battery Dealer in Noida
Car Battery Dealer
Exide Battery Dealer

EG MEDI said... is online medical store pharmacy in laxmi nagar Delhi. You can Order prescription/OTC medicines online. Cash on Delivery available. Free Home Delivery

Online Pharmacy in Delhi
Buy Online medicine in Delhi
Online Pharmacy in laxmi nagar
Buy Online medicine in laxmi nagar
Onine Medical Store in Delhi
Online Medical store in laxmi nagar
Online medicine store in delhi
online medicine store in laxmi nagar
Purchase Medicine Online
Online Pharmacy India
Online Medical Store

mounika said...

This blog gives good info I really enjoy this post AWS Online Training