- First, follow step 3 in "Using ElasticSearch for storing output of Pig Script" to download the ElasticSearch Hadoop jars and upload them into HDFS.
- After that, create a Pig script like the one below.
In this script, the first two lines make the ElasticSearch Hadoop jars available to Pig. The DEFINE statement then creates an alias for
org.elasticsearch.hadoop.pig.EsStorage, giving it the simpler, user-friendly name ES. The fourth line tells Pig to load the content of the
pig/cricket index on the local ElasticSearch instance into variable A. The last line dumps the content of variable A.

    REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-2.0.0.RC1.jar
    REGISTER /user/root/elasticsearch-hadoop-2.0.0.RC1/dist/elasticsearch-hadoop-pig-2.0.0.RC1.jar
    DEFINE ES org.elasticsearch.hadoop.pig.EsStorage;
    A = LOAD 'pig/cricket' USING ES;
    DUMP A;

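At first I used the v = LOAD 'pig/cricket' USING org.elasticsearch.pig.EsStorage command to load the content of ES, and it kept throwing the following error. I realized that I was using the wrong package name: the class is org.elasticsearch.hadoop.pig.EsStorage, not org.elasticsearch.pig.EsStorage.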

    grunt> v = LOAD 'pig/cricket' USING org.elasticsearch.pig.EsStorage;
    2014-05-14 15:56:48,873 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.elasticsearch.pig.EsStorage using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /root/pig_1400106825043.log
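
For comparison, here is a sketch of the same statement with the package name corrected to org.elasticsearch.hadoop.pig.EsStorage, the class that the DEFINE line in the script aliases. It assumes the two REGISTER statements from the script have already been run in the same Grunt session and that ElasticSearch is running locally. It is an illustration, not captured output.

```pig
-- Assumption: both elasticsearch-hadoop jars were already REGISTERed in this session.
-- Same LOAD as the failing command, but with the full package name
-- org.elasticsearch.hadoop.pig instead of org.elasticsearch.pig.
v = LOAD 'pig/cricket' USING org.elasticsearch.hadoop.pig.EsStorage();

-- Print the loaded records to the console.
DUMP v;
```

Using the DEFINE alias from the script (USING ES) avoids typing the package name at all, which is the easiest way to avoid this mistake.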