The Elasticsearch for Apache Hadoop project has introduced a way to write directly to Elasticsearch without going through
the Elasticsearch OutputFormat. I wanted to try that out, so I built a simple application that saves the output of a word count into Elasticsearch. You can download this project from
GitHub.
The first thing I had to do was build a Maven pom.xml that includes the org.elasticsearch.elasticsearch-hadoop version 5.0 jar. I could not find it in the regular Maven repository, so I had to add the Elasticsearch repository to my pom.xml.
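As a rough idea of what that part of the pom.xml can look like, here is a minimal sketch. The repository id and URL are assumptions on my part (point them at whichever Elasticsearch repository hosts the jar for you), and the exact version string 5.0.0 is also an assumption standing in for "version 5.0".

```xml
<!-- Assumption: repository id and URL are illustrative; use the Elasticsearch
     repository that actually hosts the jar you need. -->
<repositories>
  <repository>
    <id>elastic-releases</id>
    <url>https://artifacts.elastic.co/maven</url>
  </repository>
</repositories>

<dependencies>
  <!-- Assumption: 5.0.0 stands in for the 5.0 version mentioned above. -->
  <dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch-hadoop</artifactId>
    <version>5.0.0</version>
  </dependency>
</dependencies>
```

Both elements go inside the top-level project element of the pom.xml, next to the Spark dependencies.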
This is what my Spark program looks like. The main part is line 42, where I create a Map of all the properties needed to save this RDD into Elasticsearch, and line 43, where I call wordCountJson.saveToEs(esMap), which takes care of writing the data into Elasticsearch.
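For readers who just want the shape of the idea, here is a minimal sketch of a word count that ends in a saveToEs call. It is not the exact listing from the project, so its line numbers will not match the ones mentioned above. The object name, the input path, the field names, the localhost:9200 node and the wordcount/counts index/type are all assumptions for illustration. Only the names wordCountJson and esMap come from the original program.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // brings saveToEs into scope for RDDs

// Hypothetical object name, for illustration only.
object WordCountToEs {
  def main(args: Array[String]): Unit = {
    // Assumption: input path is the first argument, with a placeholder default.
    val inputPath = args.headOption.getOrElse("input.txt")

    // The master is expected to be supplied by spark-submit.
    val conf = new SparkConf().setAppName("WordCountToEs")
    val sc = new SparkContext(conf)

    // Classic word count: split lines into words, then sum a 1 per occurrence.
    val wordCounts = sc.textFile(inputPath)
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Turn each (word, count) pair into a Map, which is indexed as one document.
    // Assumption: the field names "word" and "count" are illustrative.
    val wordCountJson = wordCounts.map { case (word, count) =>
      Map("word" -> word, "count" -> count)
    }

    // Properties for the Elasticsearch connector.
    // Assumption: a local node and an index/type called wordcount/counts.
    val esMap = Map(
      "es.nodes" -> "localhost",
      "es.port" -> "9200",
      "es.resource" -> "wordcount/counts"
    )

    // Write every element of the RDD to Elasticsearch.
    wordCountJson.saveToEs(esMap)

    sc.stop()
  }
}
```

Passing the target as es.resource inside the Map keeps all connection settings in one place. The connector also accepts the resource as a plain string argument if you prefer that.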