Sunil's Notes: Flume to Spark Streaming - Pull model

In this post I will demonstrate how to stream data from Flume into Spark using Spark Streaming. When it comes to streaming data from Flume to Spark, you have two options:

Push model: Spark listens on a particular port for Avro events, and Flume connects to that port and publishes events to it.
Pull model: You use a special Spark sink in Flume that buffers the published events, and Spark pulls that data from the sink at a certain frequency.

I built a simple configuration in which I could send events to Flume over netcat; Flume takes those events and sends them to Spark as well as printing them to the console.

First, download spark-streaming-flume-sink_2.10-1.6.0.jar and copy it to the flume/lib directory.
Next, create a Flume configuration in which Flume listens for netcat events on port 44444 and replicates every event to both a logger sink and the Spark sink. The Spark sink listens on port 9999 for the Spark program to connect.
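A configuration along these lines could look like the following sketch. The agent name (agent), the component names (netcatSrc, logChannel, sparkChannel, logSink, sparkSink) and the use of localhost and memory channels are my assumptions for illustration; only the ports 44444 and 9999 and the netcat/logger/Spark sink layout come from the description above.

```properties
# Hypothetical agent and component names; adjust to your setup
agent.sources = netcatSrc
agent.channels = logChannel sparkChannel
agent.sinks = logSink sparkSink

# Netcat source listening on port 44444
agent.sources.netcatSrc.type = netcat
agent.sources.netcatSrc.bind = localhost
agent.sources.netcatSrc.port = 44444
# Replicate every event to both channels
agent.sources.netcatSrc.channels = logChannel sparkChannel
agent.sources.netcatSrc.selector.type = replicating

# In-memory channels (assumed; fine for a local test)
agent.channels.logChannel.type = memory
agent.channels.sparkChannel.type = memory

# Logger sink prints events to the Flume console/log
agent.sinks.logSink.type = logger
agent.sinks.logSink.channel = logChannel

# Spark sink buffers events until Spark pulls them on port 9999
agent.sinks.sparkSink.type = org.apache.spark.streaming.flume.sink.SparkSink
agent.sinks.sparkSink.hostname = localhost
agent.sinks.sparkSink.port = 9999
agent.sinks.sparkSink.channel = sparkChannel
```

Assuming the file is saved as conf/spark-pull.conf (a made-up name), the agent could be started with something like: flume-ng agent --conf conf --conf-file conf/spark-pull.conf --name agent -Dflume.root.logger=INFO,console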
Then write the Spark driver. The Spark Flume receiver gets each event in Avro format, so you have to call event.getBody().array() to get the event payload.
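A minimal sketch of such a driver in Scala is shown here. It assumes the spark-streaming-flume_2.10 1.6.0 artifact is on the classpath, that the Spark sink runs on localhost:9999, and a 5-second batch interval; the object name FlumePullSketch and the helper decodeBody are hypothetical names chosen for this example.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumePullSketch {

  // Turn the Avro event body into text. getBody returns a java.nio.ByteBuffer,
  // and array() exposes its backing bytes, as described in the post.
  def decodeBody(body: ByteBuffer): String =
    new String(body.array(), StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    // Fall back to local[2] when no master is supplied: one thread for the
    // receiver, one for processing (assumption: local test run).
    val conf = new SparkConf()
      .setAppName("FlumePullSketch")
      .setIfMissing("spark.master", "local[2]")

    // Assumed batch interval of 5 seconds
    val ssc = new StreamingContext(conf, Seconds(5))

    // Pull model: Spark connects to the Spark sink exposed by the Flume agent
    val events = FlumeUtils.createPollingStream(ssc, "localhost", 9999)

    // Each element is a SparkFlumeEvent wrapping the Avro event
    events.map(e => decodeBody(e.event.getBody)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The print() call writes the first few decoded messages of every batch to the driver console, which is enough to confirm that events are flowing end to end.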



Once your Spark application and Flume agent are started, open netcat on port 44444 and send messages; those messages should appear in your Spark console.

Labels: flume, spark, sparkstreaming

6 comments:

Unknown said...
Hi i am trying to integrate spark with flume. can u plz make a video or elaborate the post step by step. thank u so much
March 1, 2016 at 10:38 PM

Anonymous said...
Can you please provide the same concept using java? And also make video
June 10, 2016 at 10:08 PM

Anonymous said...
Hi Sunil
Can you let me know what all dependencies are required for the pom.xml file. Also how to integrate it with sbt build tool in eclipse.
Thanks
September 14, 2016 at 4:11 AM

Anonymous said...
Hi Sunil,
Can you please specify the command you used for deploying spark. which are the host and port numbers used for the same?
October 25, 2016 at 3:30 AM

Anonymous said...
Hi,
I'm trying to implement it, but my problem is that my Flume memory channel is filled, but events are not consumed by the sink. My configuration seems good. Did you already have this kind of problem ?
Thank in advance
Hélène
February 17, 2017 at 6:56 AM

Anonymous said...
dislike
June 8, 2018 at 3:14 PM

About Me

Sunil Patil

I am a Big Data Solution Architect working for MapR Technologies. I am also the author of a few books and articles.
