Writing data from Spark to Elasticsearch

The Elasticsearch for Apache Hadoop project has introduced a way to write directly to Elasticsearch from Spark without going through the Elasticsearch OutputFormat. I wanted to try that out, so I built a simple application that saves the output of a word count into Elasticsearch. You can download this project from GitHub.

The first thing I had to do was build a Maven pom.xml that includes the org.elasticsearch.elasticsearch-hadoop version 5.0 jar. I could not find it in the regular Maven repository, so I had to add the Elasticsearch repository to my pom.xml.

In my Spark program, the main part is line 42, where I create a Map of all the properties that I need for saving this RDD into Elasticsearch, and line 43, where I call wordCountJson.saveToEs(esMap), which takes care of writing the data into Elasticsearch.
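For readers who do not have the project at hand, here is a minimal sketch of what such a program could look like. It is not the original source from the project, so its line numbers do not match the ones mentioned above. The object name, the command-line arguments, the document field names and the wordcount/counts resource are all assumptions made for illustration; only the names wordCountJson and esMap and the saveToEs(esMap) call come from the description above. The es.nodes, es.port and es.resource keys are standard elasticsearch-hadoop settings, and index/type resource names only apply to older Elasticsearch versions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // brings saveToEs into scope for RDDs

// Hypothetical object name; the original project may be organized differently.
object WordCountToEs {
  def main(args: Array[String]): Unit = {
    // Assumed arguments: input file path and, optionally, the Elasticsearch host.
    val inputPath = args(0)
    val esNodes = if (args.length > 1) args(1) else "localhost"

    val conf = new SparkConf().setAppName("WordCountToEs")
    val sc = new SparkContext(conf)

    // Classic word count: split lines into words and sum the occurrences.
    val wordCounts = sc.textFile(inputPath)
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // One document per word; the field names "word" and "count" are assumptions.
    val wordCountJson = wordCounts.map { case (word, count) =>
      Map("word" -> word, "count" -> count)
    }

    // Properties needed to save the RDD into Elasticsearch.
    val esMap = Map(
      "es.nodes" -> esNodes,
      "es.port" -> "9200",
      "es.resource" -> "wordcount/counts" // assumed index/type name
    )

    // Writes every element of the RDD to Elasticsearch as a document.
    wordCountJson.saveToEs(esMap)

    sc.stop()
  }
}
```

Passing the target through es.resource in the Map keeps all connection settings in one place. The connector also accepts the resource name as a separate first argument to saveToEs if you prefer that style.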
