Configuring Flume to use Twitter as Source

I wanted to figure out how to configure Twitter as a source for Flume, so I tried these steps:
  1. First, go to the Twitter Application Management page and configure an application. This should give you a consumerKey, consumerSecret, accessToken, and accessTokenSecret.
  2. Next, create twitterflume.properties, which looks like this. Create a source of type org.apache.flume.source.twitter.TwitterSource and use the four values from the previous step to configure access to Twitter:
    
    agent1.sources = twitter1
    agent1.sinks = logger1
    agent1.channels = memory1
    
    
    agent1.sources.twitter1.type = org.apache.flume.source.twitter.TwitterSource
    agent1.sources.twitter1.consumerKey = <consumerKey>
    agent1.sources.twitter1.consumerSecret = <consumerSecret>
    agent1.sources.twitter1.accessToken = <accessToken>
    agent1.sources.twitter1.accessTokenSecret = <accessTokenSecret>
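    # Note: keywords is not among the properties the Flume User Guide lists for this source,
    # which reads Twitter's sample stream, so this line is probably ignored.
    agent1.sources.twitter1.keywords = bigdata, hadoop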
    agent1.sources.twitter1.maxBatchSize = 10
    agent1.sources.twitter1.maxBatchDurationMillis = 200
    
    
    # Describe the sink
    agent1.sinks.logger1.type = logger
    
    # Use a channel which buffers events in memory
    agent1.channels.memory1.type = memory
    agent1.channels.memory1.capacity = 1000
    agent1.channels.memory1.transactionCapacity = 100
    
    # Bind the source and sink to the channel
    agent1.sources.twitter1.channels = memory1
    agent1.sinks.logger1.channel = memory1
    
  3. The last step is to run the Flume agent; you should see Twitter messages being dumped to the console:

    bin/flume-ng agent --conf conf --conf-file conf/twitterflume.properties --name agent1 -Dflume.root.logger=DEBUG,console

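
One easy mistake with the file from step 2 is leaving a `<consumerKey>`-style placeholder in place. The following is a minimal sketch of a check to run before starting the agent. The function name `check_placeholders` is hypothetical (it is not part of Flume), and the path assumes the file sits at conf/twitterflume.properties as in the run command.

```shell
# Hypothetical helper: fail if a properties file still contains "=<...>" placeholders.
check_placeholders() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # grep -n prints each offending line with its line number.
    if grep -nE '=[[:space:]]*<[^>]*>[[:space:]]*$' "$file"; then
        echo "placeholders still present in $file" >&2
        return 1
    fi
    return 0
}

# Path assumed from step 2; adjust to wherever the file actually lives.
check_placeholders conf/twitterflume.properties ||
    echo "fix conf/twitterflume.properties before starting the agent"
```

The check only looks for values that are still wrapped in angle brackets; it does not verify that the keys themselves are valid.
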
Note: When I tried this in the Hadoop Sandbox, I started getting the following authentication error. It seems that the problem occurs when the VM's clock is in the past. For example, when I ran the date command on my sandbox, it returned a date that was 3 days in the past. After I restarted the VM, the date command returned the correct time and the following error went away. Note that the trace ends in java.net.UnknownHostException for stream.twitter.com, so name resolution inside the VM is also worth checking.

[Twitter Stream consumer-1[Establishing connection]] ERROR org.apache.flume.source.twitter.TwitterSource (TwitterSource.java:331) - Exception while streaming tweets
stream.twitter.com
Relevant discussions can be found on the Internet at:
    http://www.google.co.jp/search?q=d0031b0b or
    http://www.google.co.jp/search?q=1db75522
TwitterException{exceptionCode=[d0031b0b-1db75522 db667dea-99334ae4], statusCode=-1, message=null, code=-1, retryAfter=-1, rateLimitStatus=null, version=3.0.3}
    at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:192)
    at twitter4j.internal.http.HttpClientWrapper.request(HttpClientWrapper.java:61)
    at twitter4j.internal.http.HttpClientWrapper.get(HttpClientWrapper.java:89)
    at twitter4j.TwitterStreamImpl.getSampleStream(TwitterStreamImpl.java:176)
    at twitter4j.TwitterStreamImpl$4.getStream(TwitterStreamImpl.java:164)
    at twitter4j.TwitterStreamImpl$TwitterStreamConsumer.run(TwitterStreamImpl.java:462)
Caused by: java.net.UnknownHostException: stream.twitter.com
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:637)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:933)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1301)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
    at twitter4j.internal.http.HttpResponseImpl.<init>(HttpResponseImpl.java:34)
    at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:156)
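
Because the trace ends in java.net.UnknownHostException, two things are worth checking inside the VM before restarting the agent: the clock and name resolution. The following is a rough sketch, assuming a Linux sandbox where `date` and `getent` are available. The function name `check_dns` is hypothetical, and `getent hosts` goes through the system resolver, which may not match the JVM's lookup in every setup.

```shell
# Hypothetical helper: report whether a host name resolves on this machine.
check_dns() {
    host="$1"
    if getent hosts "$host" > /dev/null 2>&1; then
        echo "resolves: $host"
    else
        echo "cannot resolve: $host"
        return 1
    fi
}

# Print the VM clock in UTC for comparison with the real time.
date -u

# Host name taken from the UnknownHostException in the trace.
check_dns stream.twitter.com ||
    echo "fix name resolution in the VM before restarting the agent"
```

If the clock is off, correct it first; if the host does not resolve, look at the VM's network and DNS settings rather than at the Twitter credentials.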

Comments:

Darpan said...

I am getting the same error. My dates are fine. Restarted the system also... Does not work. Please help.

黄辰光 said...

I've hit the same problem, and I have not solved it yet.

Riya Jhalani said...

I am facing the same error. Can somebody help?

Paras Wadhwa said...

Just run:

hduser@ubuntu64server:~/apache-flume-1.6.0-bin/conf$ nslookup stream.twitter.com
;; connection timed out; no servers could be reached

This is what is causing the issue.
