Theme Streams

Requirements

Note:

  • By default, the modified version of the ES Twitter River plugin uses port 9400 for its websocket connection. See the source.
  • If you modify the plugins, you must recompile the package and reinstall the plugin on your Elasticsearch server.

Installation

Please follow these step-by-step instructions:

  1. Install Elasticsearch

    See the installation instructions

  2. Install Elasticsearch Transport Websocket Plugin

    $ ./plugin --install websocket --url file:elasticsearch-transport-websocket-1.3.1.0-plugin.zip

  3. Install Elasticsearch Twitter River Plugin

    $ ./plugin --install river-twitter --url file:elasticsearch-river-twitter-3.0.0-SNAPSHOT.zip

  4. Install Themestreams Site

    • Go to your elasticsearch directory
    • Download themestreams from the repository and extract the package
    • Go to the plugin folder
    • Create a new folder named themestreams
    • Go to that directory and create a symbolic link named _site that points to the themestreams source code

    $ ln -s themestreams_source_code_path/ThemeStreams/ _site
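
    The bullet steps above can be sketched as a small shell function. This is only a sketch under assumptions: ES_HOME and THEMESTREAMS_SRC are hypothetical variables, not something themestreams defines, and the plugin folder is assumed to be plugins/ inside the Elasticsearch directory.

```shell
# Hypothetical locations; adjust both to your environment.
ES_HOME="${ES_HOME:-$HOME/elasticsearch}"
THEMESTREAMS_SRC="${THEMESTREAMS_SRC:-$HOME/themestreams_source_code_path/ThemeStreams}"

install_site() {
  # Create plugins/themestreams (assumed plugin folder layout).
  mkdir -p "$ES_HOME/plugins/themestreams" || return 1
  # Link _site to the source tree, replacing a stale link if one exists.
  ln -sfn "$THEMESTREAMS_SRC" "$ES_HOME/plugins/themestreams/_site"
}
```

    After calling install_site, listing $ES_HOME/plugins/themestreams should show _site pointing at the source directory.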

  5. Configure Elasticsearch

    Add these lines to your elasticsearch.yml

    http.port: 8004 # specify http port
    websocket.enabled: true
    websocket.onsiteonly: false
    websocket.port: 8005 # specify websocket port
    river.twitter.oauth.consumer_key: "your_consumer_key"
    river.twitter.oauth.consumer_secret: "your_consumer_secret"
    river.twitter.oauth.access_token: "your_access_token"
    river.twitter.oauth.access_token_secret: "your_access_token_secret"
    

    Set http.port and websocket.port to the HTTP and websocket ports you want to use.
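
    Before starting the server, it can help to confirm that every key listed above is present in the config file. The function below is a minimal sketch, not part of themestreams: check_config is a hypothetical helper, and it assumes the flat "key: value" form shown above rather than nested YAML.

```shell
check_config() {
  # usage: check_config PATH_TO_elasticsearch.yml
  # Prints each required key that has no uncommented entry;
  # returns 1 if any key is missing, 0 otherwise.
  conf="$1"
  missing=0
  for key in http.port websocket.enabled websocket.onsiteonly websocket.port \
             river.twitter.oauth.consumer_key river.twitter.oauth.consumer_secret \
             river.twitter.oauth.access_token river.twitter.oauth.access_token_secret; do
    # The trailing colon keeps access_token from matching access_token_secret.
    if ! grep -q "^[[:space:]]*${key}:" "$conf"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

    For example, check_config config/elasticsearch.yml prints nothing and returns 0 when all eight keys are set.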

  6. Run Elasticsearch

  7. Create Index and Analyzer

    In the following examples, we use the index name tweets_v2 and the base URL http://zookst20.science.uva.nl:8004/.

    PUT http://zookst20.science.uva.nl:8004/tweets_v2
    {
      "settings" : {
        "analysis" : {
          "filter" : {
            "tweet_filter" : {
              "type" : "word_delimiter",
              "type_table": ["# => ALPHA", "@ => ALPHA"]
            }   
          },
          "analyzer" : {
            "tweet_analyzer" : {
              "type" : "custom",
              "tokenizer" : "whitespace",
              "filter" : ["lowercase", "tweet_filter"]
            }
          }
        }
      },
      "mappings" : {
        "status" : {
          "properties" : {
            "text" : {
              "type" : "string",
              "analyzer" : "tweet_analyzer"
            }
          }
        }
      }
    }
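
    The request above is written as an HTTP method and URL followed by a JSON body. One way to send it is with curl. The helper below is a sketch under assumptions: es_request, ES_URL and CURL are hypothetical names, and the JSON body is assumed to be saved in a file first.

```shell
# Base URL from the examples; set CURL to another command for a dry run.
ES_URL="${ES_URL:-http://zookst20.science.uva.nl:8004}"
CURL="${CURL:-curl}"

es_request() {
  # usage: es_request METHOD PATH [JSON_FILE]
  req_method="$1"
  req_path="$2"
  req_body="${3:-}"
  if [ -n "$req_body" ]; then
    # --data-binary sends the file unchanged as the request body.
    $CURL -s -X "$req_method" "$ES_URL/$req_path" --data-binary "@$req_body"
  else
    $CURL -s -X "$req_method" "$ES_URL/$req_path"
  fi
}
```

    With the body saved as create_index.json (a hypothetical file name), the index could be created and the analyzer checked like this:

    $ es_request PUT tweets_v2 create_index.json
    $ es_request GET 'tweets_v2/_analyze?analyzer=tweet_analyzer&text=%23hashtag+%40user'

    If the analyzer works as intended, the second call should return #hashtag and @user as single lowercase tokens with the # and @ intact.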
    
  8. Create Default Percolator Named themestreams

    PUT http://zookst20.science.uva.nl:8004/tweets_v2/.percolator/themestreams
    {
      "query" : {
        "match_all" : {}
      }
    }
    
  9. Set Default ttl for Other Percolators

    POST http://zookst20.science.uva.nl:8004/tweets_v2/.percolator/_mapping
    {
      ".percolator" : {
        "_ttl" : { "enabled" : true, "default" : "1d" }
      }
    }
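
    To see the percolators in action, you can register an additional query and percolate a sample document against the index. The functions below are a sketch only: percolator_query and percolate_doc are hypothetical helpers that print request bodies, and the percolator name climate is an illustrative assumption, not something themestreams defines.

```shell
percolator_query() {
  # usage: percolator_query TERM
  # Prints a percolator document matching tweets whose text contains TERM.
  # No JSON escaping is done: TERM must not contain quotes or backslashes.
  printf '{"query":{"match":{"text":"%s"}}}\n' "$1"
}

percolate_doc() {
  # usage: percolate_doc TEXT
  # Prints a body for the _percolate API, wrapping TEXT as a status document.
  # The same escaping caveat applies to TEXT.
  printf '{"doc":{"text":"%s"}}\n' "$1"
}
```

    The output can be piped to curl, which reads the body from standard input with --data-binary @-:

    $ percolator_query '#climate' | curl -s -X PUT 'http://zookst20.science.uva.nl:8004/tweets_v2/.percolator/climate' --data-binary @-
    $ percolate_doc 'New report on #climate policy' | curl -s -X GET 'http://zookst20.science.uva.nl:8004/tweets_v2/status/_percolate' --data-binary @-

    The response should list the IDs of the matching percolators. The match_all percolator themestreams is expected to match every document, and a percolator registered without an explicit ttl should pick up the default of 1d set above.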
    
  10. Create Twitter River

    PUT http://zookst20.science.uva.nl:8004/_river/tweets_v2/_meta
    {
      "type": "twitter",
      "twitter": {
        "filter": {
          "user_lists": "themestreams/politician,themestreams/lobbyist,themestreams/journalist,themestreams/other"
        }
      }
    }
    

    Here we create a new Twitter river that feeds the existing index tweets_v2 and specify the Twitter lists to index. In this example, themestreams is the Twitter username, and politician, lobbyist, journalist, and other are the names of the lists.
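
    The user_lists value is a comma-separated string of username/listname pairs. When all lists belong to the same user, it can be assembled with a small function. This is a sketch, and user_lists here is a hypothetical shell helper, not part of the river plugin.

```shell
user_lists() {
  # usage: user_lists USERNAME LIST [LIST...]
  # Prints "USERNAME/LIST,USERNAME/LIST,..." in the format used by the filter.
  owner="$1"
  shift
  result=""
  for list in "$@"; do
    # Add a comma separator only when result is already non-empty.
    result="${result:+$result,}$owner/$list"
  done
  printf '%s\n' "$result"
}
```

    For example, user_lists themestreams politician lobbyist prints themestreams/politician,themestreams/lobbyist. Once the river is running, a GET request to http://zookst20.science.uva.nl:8004/tweets_v2/_count should show the document count growing as tweets are indexed.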