For the last couple of months I have been using the YARN framework for running my MapReduce jobs. Normally using YARN is transparent, so I did not have to do anything different other than change my mapred-site.xml file to set the value of
mapreduce.framework.name
to
yarn
like this.
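A minimal sketch of that mapred-site.xml entry (the property name and value here are the standard Hadoop ones; the rest of your configuration stays as it is):

```xml
<configuration>
  <!-- Switch the MapReduce runtime from the classic JobTracker to YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```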
But YARN affects how the logs and job history get stored. For example, if you're using the traditional MapReduce framework, you can go to
http://localhost:50030
to look at the job and task history and also access the logs generated by the MapReduce framework. In the case of YARN, you will have to go to
http://localhost:8088/cluster
which will take you to the ResourceManager home page. There you should see a list of applications; click on the name of an application to get more details.
When you try to look at the logs for an application, it takes you to the NodeManager home page.
Since I am working on a single-node cluster, I like to go to the Hadoop log directory; there, under the userlogs directory, I can see a log folder for each application. Each application folder is subdivided into container folders, one for the mapper task, one for the reducer task, and one for the driver task, and each container folder has one file each for stdout, stderr, and syslog that contain more output. If you have any
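That layout can be sketched with a small mock. The application and container IDs below are made up for illustration; real IDs are assigned by YARN:

```shell
# Mock of the $HADOOP_LOG_DIR/userlogs layout on a single-node cluster.
# Application/container IDs here are illustrative, not real ones.
LOG_DIR="$(mktemp -d)/userlogs"
APP="application_1400000000000_0001"
# One container folder per task; typically container ..._000001 holds the
# application master (the "driver"), and later containers run the map and
# reduce tasks.
for c in container_1400000000000_0001_01_000001 \
         container_1400000000000_0001_01_000002 \
         container_1400000000000_0001_01_000003; do
  mkdir -p "$LOG_DIR/$APP/$c"
  touch "$LOG_DIR/$APP/$c/stdout" \
        "$LOG_DIR/$APP/$c/stderr" \
        "$LOG_DIR/$APP/$c/syslog"
done
# List every log file, mirroring what you would see with find/ls.
find "$LOG_DIR" -type f | sort
```

Browsing the real directory with `find` or `ls -R` in the same way is usually the quickest route to a specific container's logs on a single-node setup.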
System.out.println()
in your mapper or reducer class, find the appropriate container folder; the stdout file in that container should have the output that you generated using
System.out.println()
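The reason the println output lands in that stdout file is that the container launch script redirects the child JVM's standard output and standard error into the container's log directory. A tiny mock of that redirection, with `echo` standing in for the real `java` invocation:

```shell
# Mock of how a container's stdout/stderr get captured into log files.
# 'echo' stands in for the actual JVM command YARN launches.
CONTAINER_LOG_DIR="$(mktemp -d)"
( echo "hello from System.out.println" ) \
    1>"$CONTAINER_LOG_DIR/stdout" 2>"$CONTAINER_LOG_DIR/stderr"
# Anything the task wrote to System.out now sits in the stdout file.
cat "$CONTAINER_LOG_DIR/stdout"
```

This is also why output from `System.err.println()` or an uncaught exception's stack trace shows up in the stderr file of the same container folder instead.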