How to run Sqoop command from oozie

In the Importing data from Sqoop into Hive External Table with Avro encoding updated i blogged about how you can use sqoop to import data from RDBMS into Hadoop. I wanted to test if i can use Oozie for invoking Sqoop command and i followed these steps for doing that.
  1. First i tried executing this command from my command line on Hadoop cluster to make sure that i can actually run sqoop without any problem
    
    sqoop import --connect jdbc:mysql://localhost/test 
    --username root 
    --password cloudera 
    --table CUSTOMER 
    --as-avrodatafile
    
  2. Once the sqoop command was successfully executed i went back and deleted the CUSTOMER directory from HDFS to make sure that i could re-import data using following command
    
    hdfs dfs -rm -R CUSTOMER
    
  3. Next i went to Hue to create oozie workflow with single sqoop command that i had executed before
    But if your not using the Hue console you can create workflow.xml manually like this Also make sure to create job.properties file like this Take a look at Enabling Oozie console on Cloudera VM 4.4.0 and executing examples for information on how to run oozie job from command line
  4. Next when i ran the Oozie workflow, the job failed with following error, which indicates that Oozie does not have the MySQL JDBC driver.
    
    java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver
     at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:875)
     at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
     at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763)
     at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
     at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
     at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
     at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:246)
     at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:327)
     at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1846)
     at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1646)
     at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
     at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
     at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
     at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
     at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
     at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
    
  5. So first thing i did was to check if mysql driver is there in the oozie shared lib by executing following commands
    
    export OOZIE_URL=http://localhost:11000/oozie
    oozie admin -shareliblist sqoop
    
    I noticed that the mysql-connector-java.jar was not there in the list of shared libs for Oozie + sqoop
  6. Next step was to find the mysql-connector-java.jar in my sandbox that i could do by finding it like this
    
    sudo find / -name mysql*
    
    I found mysql-connector-java.jar on my local machine at /var/lib/sqoop/mysql-connector-java.jar
  7. I wanted to update the Oozie shared lib to include the mysql driver jar. So i executed following command to figure out the directory where the oozie sqoop shared lib is
    
    oozie admin -sharelibupdate
    
    From this output i got HDFS directory location for Oozie shared lib which is /user/oozie/share/lib/lib_20160406022812
  8. Then i used following two commands to first copy the db driver into the oozie shared lib and making sure it is accessible to other users hdfs -copyFromLocal /var/lib/sqoop/mysql-connector-java.jar /user/oozie/share/lib/sqoop/. hdfs dfs -chmod 777 /user/oozie/share/lib/sqoop/mysql-connector-java.jar
  9. Now the last step was to let Oozie know that it should reload the sharedlib and i did that by executing following two commands
    
    oozie admin -sharedlibupdate
    oozie admin -shareliblist sqoop | grep mysql*
    
    The second command queries oozie to get current list of shared jars and i could see mysql-connector-java.jar listed in it like this
When i re-executed the ooize job again this time it ran successfully.

11 comments:

mcafee.com activate said...

McAfee provides security for all sorts of users. They supply services and products for home and office at home, enterprise businesses with over 250 workers, and small organizations with under 250 employees, and also venture opportunities.

Install Office said...


We are providing help and support for Microsoft office Setup and activation. Call us or email us the error or problem, our one of the expert contact you with the suitable perfect solution. Get the MS Office application suite and as per your need and see how it is easy to work with Microsoft Office.

Tejuteju said...

It was the very nice article and it is very useful Big data Hadoop online Course Bangalore

jolly said...

good ..
Avast Support Number

jolly said...

good!!
www.office.com/setup

Garmin Gps Map Update said...

I really happy found this website eventually. Really informative and inoperative, Thanks for the post and effort! Please keep sharing more such blog. If your device is still not working, you can take the help of an expert to get rid of this error. Don’t hesitate to talk with professionals whenever you face this issue, or any other glitch. We can help you use the Garmin GPS update with perfection.

Catherine said...

All in one solution for readers business: supreme Package click on website and logo design company

Grayson Dolan said...

We are really grateful for your blog post. You will find a lot of approaches after visiting your post. I was exactly searching for. Thanks for such post and please keep it up. Thanks for sharing! Know about fubo.tv/activate.

Techsolve Solutions said...

Thanks for sharing useful information. Keep on sharing.

Web designing companies in toronto
Digital marketing agency toronto

Anonymous said...

Thank you so much for the perfect blog. hope to see this blog. and it's beneficial for others. your traffic ticket

Anonymous said...

digital content marketing Services