In Using ElasticSearch for storing output of Pig Script, I built a sample that stores the output of a Pig script in ElasticSearch. I wanted to try the reverse, using an ElasticSearch index or search result as input to a Pig script, so I built this sample.
- First, follow step 3 in Using ElasticSearch for storing output of Pig Script to download the ElasticSearch Hadoop jars and upload them to HDFS.
- After that, create a Pig script like the one below.
In this script, the first two lines make the ElasticSearch Hadoop jars available to Pig. The DEFINE statement then creates an alias for org.elasticsearch.hadoop.pig.EsStorage, giving it the short, user-friendly name ES. The fourth line tells Pig to load the contents of pig/cricket (the cricket type in the pig index) from the local ElasticSearch instance into the relation A. The last line dumps the contents of A to the console.
REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-2.0.0.RC1.jar
REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-pig-2.0.0.RC1.jar
DEFINE ES org.elasticsearch.hadoop.pig.EsStorage;
A = LOAD 'pig/cricket' USING ES;
DUMP A;
After I executed the script, the DUMP statement printed the documents from pig/cricket to the console.
Note: Before I got it to work, I was using the command
v = LOAD 'pig/cricket' USING org.elasticsearch.pig.EsStorage
to load the content from ES, and it kept throwing the following error. I realized that I was using the wrong package name: the class lives in org.elasticsearch.hadoop.pig, not org.elasticsearch.pig.
grunt> v = LOAD 'pig/cricket' USING org.elasticsearch.pig.EsStorage;
2014-05-14 15:56:48,873 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.elasticsearch.pig.EsStorage using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /root/pig_1400106825043.log
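The working script loads every document from pig/cricket on a local ElasticSearch. As a minimal sketch of how the same load might be pointed at a specific node or narrowed to a search result, EsStorage accepts configuration strings in its constructor. The host, port, and query values below are placeholders I picked for illustration, not values from my setup, and I have not verified each option against the 2.0.0.RC1 build specifically.

```pig
-- Sketch only: the es.nodes, es.port and es.query values are assumed placeholders.
REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-2.0.0.RC1.jar;
REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-pig-2.0.0.RC1.jar;

-- es.nodes / es.port: where ElasticSearch is listening (localhost and 9200 are the defaults).
-- es.query: a URI-style query; '?q=*' matches everything, replace it with a real search.
DEFINE ES org.elasticsearch.hadoop.pig.EsStorage(
    'es.nodes=localhost',
    'es.port=9200',
    'es.query=?q=*');

-- Note the package name: org.elasticsearch.hadoop.pig, as in the DEFINE above.
A = LOAD 'pig/cricket' USING ES;
DUMP A;
```

With a narrower es.query, only the matching documents should reach Pig, which keeps the filtering on the ElasticSearch side instead of in the script.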