Reading content of ElasticSearch index into Pig Script

In the post Using ElasticSearch for storing output of Pig Script, I built a sample that stores the output of a Pig script in ElasticSearch. I wanted to try the reverse and use the contents of an ElasticSearch index (or a search result) as input to a Pig script, so I built this sample:
  1. First, follow step 3 in Using ElasticSearch for storing output of Pig Script to download the ElasticSearch Hadoop jars and upload them into HDFS.
  2. Next, create a Pig script like the one below. The first two lines make the ElasticSearch Hadoop jars available to Pig. The DEFINE statement creates an alias for org.elasticsearch.hadoop.pig.EsStorage and gives it the short, friendly name ES. The fourth statement tells Pig to load the content of the pig/cricket index on the local ElasticSearch instance into the relation A. The last line dumps the content of A.
    
    REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-2.0.0.RC1.jar
    REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-pig-2.0.0.RC1.jar
    
    DEFINE ES org.elasticsearch.hadoop.pig.EsStorage;
    A = LOAD 'pig/cricket' USING ES;
    DUMP A;
    
After I executed the script, Pig dumped the content of the pig/cricket index to the console.

Note: Before I got it to work, I was using v = LOAD 'pig/cricket' USING org.elasticsearch.pig.EsStorage to load the content from ElasticSearch, and it kept throwing the following error. I then realized that I was using the wrong package name: the class is org.elasticsearch.hadoop.pig.EsStorage, not org.elasticsearch.pig.EsStorage.

    grunt> v = LOAD 'pig/cricket' USING org.elasticsearch.pig.EsStorage;
    2014-05-14 15:56:48,873 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.elasticsearch.pig.EsStorage using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /root/pig_1400106825043.log
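
The script in step 2 loads the whole index. To feed only a search result into Pig, EsStorage can be given a query through the es.query setting. The following is a minimal sketch and not something I ran as part of this sample: it assumes the documents in pig/cricket have a field called player (a hypothetical field name, as are the search term sachin and the alias ES_QUERY), and it reuses the jar paths from step 2.

    REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-2.0.0.RC1.jar
    REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-pig-2.0.0.RC1.jar

    -- Alias EsStorage with a URI-style query so only matching documents are read.
    -- 'player' and 'sachin' are assumed values used for illustration only.
    DEFINE ES_QUERY org.elasticsearch.hadoop.pig.EsStorage('es.query=?q=player:sachin');

    -- Load the documents in pig/cricket that match the query, then print them.
    B = LOAD 'pig/cricket' USING ES_QUERY;
    DUMP B;

If the setting is left out, as in step 2, every document in the index is loaded.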
