Configuring Flume to use Twitter as Source

I wanted to figure out how to configure Twitter as a source for Flume, so I tried these steps:
  1. First, go to the Twitter Application Management page and configure an application. This should give you a consumerKey, consumerSecret, accessToken and accessTokenSecret.
  2. Next, create twitterflume.properties, which looks like this. Create a source of type org.apache.flume.source.twitter.TwitterSource and use the 4 values you got in the last step to configure access to Twitter.
    
    agent1.sources = twitter1
    agent1.sinks = logger1
    agent1.channels = memory1
    
    
    agent1.sources.twitter1.type = org.apache.flume.source.twitter.TwitterSource
    agent1.sources.twitter1.consumerKey = <consumerKey>
    agent1.sources.twitter1.consumerSecret = <consumerSecret>
    agent1.sources.twitter1.accessToken = <accessToken>
    agent1.sources.twitter1.accessTokenSecret = <accessTokenSecret>
    agent1.sources.twitter1.keywords = bigdata, hadoop
    agent1.sources.twitter1.maxBatchSize = 10
    agent1.sources.twitter1.maxBatchDurationMillis = 200
    
    
    # Describe the sink
    agent1.sinks.logger1.type = logger
    
    # Use a channel which buffers events in memory
    agent1.channels.memory1.type = memory
    agent1.channels.memory1.capacity = 1000
    agent1.channels.memory1.transactionCapacity = 100
    
    # Bind the source and sink to the channel
    agent1.sources.twitter1.channels = memory1
    agent1.sinks.logger1.channel = memory1
    
  3. The last step is to run the Flume agent. You should see Twitter messages being dumped to the console.
    
    bin/flume-ng agent --conf conf --conf-file conf/twitterflume.properties --name agent1 -Dflume.root.logger=DEBUG,console
    
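
Before starting the agent, it can help to sanity-check the properties file from step 2. The following is only a minimal sketch, not part of Flume: the helper names `parse_properties` and `check_twitter_source` are made up for illustration, and the parser assumes simple `key = value` lines like the ones above (it does not handle the full Java properties syntax). It flags credentials that are missing or still set to a `<placeholder>`, and source channels that were never declared.

```python
import re

# Keys the Twitter source in the config above needs before it can authenticate.
REQUIRED_KEYS = ("consumerKey", "consumerSecret", "accessToken", "accessTokenSecret")


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_twitter_source(props, agent="agent1", source="twitter1"):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    prefix = f"{agent}.sources.{source}."
    for key in REQUIRED_KEYS:
        value = props.get(prefix + key, "")
        # A value such as <consumerKey> is still the template placeholder.
        if not value or re.fullmatch(r"<.*>", value):
            problems.append(f"{prefix}{key} is missing or still a placeholder")
    declared = set(props.get(f"{agent}.channels", "").split())
    for channel in props.get(prefix + "channels", "").split():
        if channel not in declared:
            problems.append(
                f"source channel '{channel}' is not declared in {agent}.channels"
            )
    return problems
```

To use it, read conf/twitterflume.properties into a string, pass it through `parse_properties`, and print whatever `check_twitter_source` returns. An empty list only means these particular checks passed, not that Twitter will accept the credentials.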
Note: When I tried this in the Hadoop Sandbox, I started getting the following error while connecting to Twitter. The problem seems to occur when the VM's clock is in the past. For example, when I ran the date command on my sandbox, it returned a date 3 days in the past. I restarted the VM, and after the restart the date command reported the correct time and the following error went away. Note that the trace ends in java.net.UnknownHostException, which means the VM could not resolve stream.twitter.com, so the restart may also have fixed the VM's network or DNS setup.

    [Twitter Stream consumer-1[Establishing connection]] ERROR org.apache.flume.source.twitter.TwitterSource (TwitterSource.java:331) - Exception while streaming tweets
    stream.twitter.com
    Relevant discussions can be found on the Internet at:
        http://www.google.co.jp/search?q=d0031b0b or
        http://www.google.co.jp/search?q=1db75522
    TwitterException{exceptionCode=[d0031b0b-1db75522 db667dea-99334ae4], statusCode=-1, message=null, code=-1, retryAfter=-1, rateLimitStatus=null, version=3.0.3}
        at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:192)
        at twitter4j.internal.http.HttpClientWrapper.request(HttpClientWrapper.java:61)
        at twitter4j.internal.http.HttpClientWrapper.get(HttpClientWrapper.java:89)
        at twitter4j.TwitterStreamImpl.getSampleStream(TwitterStreamImpl.java:176)
        at twitter4j.TwitterStreamImpl$4.getStream(TwitterStreamImpl.java:164)
        at twitter4j.TwitterStreamImpl$TwitterStreamConsumer.run(TwitterStreamImpl.java:462)
    Caused by: java.net.UnknownHostException: stream.twitter.com
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:637)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
        at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:933)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1301)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
        at twitter4j.internal.http.HttpResponseImpl.<init>(HttpResponseImpl.java:34)
        at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:156)
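
The two suspects here, name resolution and a wrong clock, can each be checked from the sandbox before restarting anything. What follows is a rough sketch under some assumptions: the function names `can_resolve` and `clock_skew_seconds` are made up for illustration and are not part of Flume or twitter4j, and the clock check expects you to supply the `Date` header from any HTTP response you trust (for example one fetched on the same VM).

```python
import socket
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def can_resolve(host, port=443):
    """Return True if the local resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Python's counterpart of the java.net.UnknownHostException in the trace.
        return False


def clock_skew_seconds(http_date, local_now=None):
    """Seconds by which the local clock lags the time in an HTTP Date header.

    A positive result means the local clock is behind the server.
    """
    server_time = parsedate_to_datetime(http_date)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (server_time - local_now).total_seconds()
```

If `can_resolve("stream.twitter.com")` returns False, the VM's DNS or network setup is the first thing to look at. If the skew is large (a VM 3 days behind would report about 259200 seconds), fixing the clock is worth trying, since requests signed with a stale timestamp are likely to be rejected.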

3 comments:

Darpan said...

I am getting the same error. My dates are fine. Restarted the system also... Does not work. Please help.

黄辰光 said...

I've hit the same problem, and I have not solved it yet.

Riya Jhalani said...

I am facing the same error. Can somebody help?