- Download the version of Spark that matches your Hadoop distribution from the Spark download page. In my case I am using the Cloudera CDH4 VM image for development, so I downloaded the CDH4 build
- I extracted spark-1.0.0-bin-cdh4.tgz into the /home/cloudera/software folder
- The next step is to build a WordCount.py program. The program has 3 methods:
  - flatMap: takes a line as input, splits it on spaces, and emits the resulting words
  - map: takes a word as input and emits a tuple in (word, 1) format
  - reduce: takes care of adding all the counters together
  The line counts = distFile.flatMap(flatMap).map(map).reduceByKey(reduce) takes care of tying everything together
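
A minimal sketch of such a WordCount.py, reconstructed from the description above rather than taken from a verified listing. The function names follow the counts = ... line; the argument check, the usage message, and the appName value are assumptions on my part.

```python
import sys


def flatMap(line):
    # Split one input line on spaces and emit the individual words.
    return line.split(" ")


def map(word):
    # Emit a (word, 1) tuple for each word.
    # The name follows the description above and shadows the builtin map.
    return (word, 1)


def reduce(a, b):
    # Add two partial counts that belong to the same word.
    return a + b


def main(argv):
    # Assumed argument handling: script name, input path, output path.
    if len(argv) != 3:
        print("Usage: WordCount.py <input> <output>")
        return
    # Imported here so the helpers above can be tried without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="WordCount")
    distFile = sc.textFile(argv[1])
    counts = distFile.flatMap(flatMap).map(map).reduceByKey(reduce)
    counts.saveAsTextFile(argv[2])
    sc.stop()


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the three steps as plain functions means each one can be checked in a regular Python shell before the job is handed to Spark.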
- Once WordCount.py is ready, you can execute it with spark-submit by passing the path of WordCount.py followed by the input and output paths
./bin/spark-submit --master local[4] /home/cloudera/workspace/spark/HelloSpark/WordCount.py file:///home/cloudera/sorttext.txt file:///home/cloudera/output/wordcount
- Once the program is done executing, you can take a look at the output by executing the following command
more /home/cloudera/output/wordcount/part-00000
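
If you would rather inspect the result from Python, here is a small sketch. It assumes each line of the part file is the string form of a (word, count) tuple, which is how saveAsTextFile writes tuple records. The helper names parse_count_line and top_words are hypothetical, and the sample lines are made up for illustration.

```python
import ast


def parse_count_line(line):
    # Turn a line such as "('spark', 3)" back into a (word, count) tuple.
    word, count = ast.literal_eval(line.strip())
    return word, int(count)


def top_words(lines, n=10):
    # Parse every non-empty line and return the n most frequent words.
    pairs = [parse_count_line(l) for l in lines if l.strip()]
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:n]


# Made-up sample lines standing in for the contents of part-00000.
sample = ["('spark', 3)", "('hello', 1)", "('python', 2)"]
print(top_words(sample, 2))
```

To use it on the real output, pass the opened part file, for example top_words(open("/home/cloudera/output/wordcount/part-00000")).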
WordCount program written using the Spark framework in Python
In the WordCount(HelloWorld) MapReduce program entry I talked about how to build a simple WordCount program using MapReduce. I wanted to try developing the same program using Apache Spark, but in Python, so I followed these steps
Hi There,
When I use your example without the code specifying the output file, the output is printed to the terminal. But when I added the output address, there is no output, and the terminal responds with: "Usage: wordcount ".
Can you help me with this?