Preface

Shared Resource Search, a.k.a. federated search, is implemented using Elasticsearch, an open-source, distributed, schema-less, RESTful search engine based on Lucene.

Installing ElasticSearch

Although other setups are possible for BibSonomy, the recommended way is to install an Elasticsearch node locally on every machine hosting a BibSonomy installation.

Firewall

Because Elasticsearch has no built-in authentication or authorization, the firewall should block by default and only allow communication with known hosts on the following ports:

  • Port 9300 (TCP and UDP) is required in both directions for inter-node communication.
  • Port 9200 is used by the webapp to talk to the node (if the node is on the local machine, no extra firewall rules are required).
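
The port rules above could be expressed roughly as follows. This is only a sketch: the page does not name a firewall tool, so iptables is an assumption, the host addresses are hypothetical, and the function only prints the rules so they can be reviewed before anything is applied. It also assumes that outgoing traffic is not restricted (OUTPUT policy ACCEPT).

```shell
#!/bin/sh
# Print (do not apply) iptables rules that open the Elasticsearch ports
# only to the hosts given as arguments.
es_firewall_rules() {
  for host in "$@"; do
    # inter-node communication on 9300, TCP and UDP
    for proto in tcp udp; do
      echo "iptables -A INPUT -p $proto -s $host --dport 9300 -j ACCEPT"
    done
    # HTTP API on 9200, used by the webapp
    echo "iptables -A INPUT -p tcp -s $host --dport 9200 -j ACCEPT"
  done
  # everything else on these ports is dropped
  echo "iptables -A INPUT -p tcp --dport 9200 -j DROP"
  for proto in tcp udp; do
    echo "iptables -A INPUT -p $proto --dport 9300 -j DROP"
  done
}

# Example with two hypothetical node addresses
es_firewall_rules 192.0.2.10 192.0.2.11
```

Because rules are appended in order, the ACCEPT rules for known hosts are evaluated before the final DROP rules.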

Java Version

Elasticsearch requires Java 7, update 55 or later (>= 1.7.0_55). Before installing Elasticsearch, check your Java version and install or upgrade it if needed. On Ubuntu, you can update Java 7 this way:

aptitude update
aptitude install oracle-java7-installer
aptitude install oracle-java7-set-default

Don't forget to update the JAVA_HOME environment variable if it is set, for example in /etc/bash.bashrc.
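
The version requirement can be checked with a small script. The following is a minimal sketch, assuming the version string has the usual Java 7 form (for example 1.7.0_55); the function name is hypothetical, and anything that is not Java 7 is treated as unsupported because this page only names Java 7.

```shell
#!/bin/sh
# Return 0 if the given version string is Java 7 with update >= 55.
java_version_ok() {
  version=$1
  base=${version%%_*}        # part before the underscore, e.g. 1.7.0
  update=${version#*_}       # part after the underscore, e.g. 55
  [ "$update" = "$version" ] && update=0   # no underscore in the string
  update=${update%%[!0-9]*}  # strip suffixes such as -b15
  case $base in
    1.7.0) [ "${update:-0}" -ge 55 ] ;;
    *) return 1 ;;
  esac
}

# java -version writes to stderr; the version is the quoted token on line 1
if command -v java >/dev/null 2>&1; then
  installed=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
  if java_version_ok "$installed"; then
    echo "Java $installed meets the requirement"
  else
    echo "Java $installed does not meet the requirement (need 1.7.0_55 or later)"
  fi
fi
```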

Elasticsearch

Simply download the .deb package for version 1.2.4 and install it via dpkg -i <downloaded-file>.deb. Then follow the instructions printed by the installer.

Install manually

Download version 1.2.4 from the Elasticsearch website (other versions might have compatibility issues with the libraries). After downloading, unzip the archive and change into the bin directory:

cd elasticsearch-1.2.4/bin

Now start the node, which forms a single-node cluster (Windows users should run elasticsearch.bat instead):

./elasticsearch

For more details, see the installation guide on the Elasticsearch website.

Running Elasticsearch as a service

To run Elasticsearch as a service on a Linux system, Debian and RPM packages are available on the download page mentioned above. The Elasticsearch documentation has separate service setup instructions for Linux and for Windows.

Configure ElasticSearch server

To change the server configuration, edit /etc/elasticsearch/elasticsearch.yml if Elasticsearch was installed via the Debian package; otherwise go to the installation directory and edit config/elasticsearch.yml.

Change relevant settings for your environment:

  • cluster.name: your_cluster_name (must be the same on all nodes)
  • node.name: "speakingHostNameOrSo" (must be unique for each node)
  • discovery.zen.ping.multicast.enabled: false
  • discovery.zen.ping.unicast.hosts: ["node1host:9300", "node2host:9300", ...]
  • bootstrap.mlockall: true
  • optionally: path.data: /mnt/somewhere/somepath

Restart the server for the changes to take effect, then browse the cluster with the head plugin (described in the plugins section of this page) to verify them.

See the Elasticsearch configuration page for all configuration options.
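
The settings listed above can be assembled into a configuration fragment. The following sketch prints such a fragment to standard output; the function name, cluster name, and host names are hypothetical, and the output is meant to be reviewed and merged into elasticsearch.yml by hand rather than written there automatically.

```shell
#!/bin/sh
# Print an elasticsearch.yml fragment with the settings listed above.
# Usage: es_config_fragment CLUSTER_NAME NODE_NAME HOST...
es_config_fragment() {
  cluster=$1
  node=$2
  shift 2
  hosts=""
  for h in "$@"; do
    # build the quoted, comma-separated unicast host list
    hosts="${hosts:+$hosts, }\"$h:9300\""
  done
  cat <<EOF
cluster.name: $cluster
node.name: "$node"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: [$hosts]
bootstrap.mlockall: true
EOF
}

# Example with hypothetical names
es_config_fragment bibsonomy-search node1 node1host node2host
```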

JVM memory settings

Elasticsearch should be given more memory than it gets by default. You can do this in /etc/init.d/elasticsearch by uncommenting and setting:

ES_HEAP_SIZE=6g
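
How large the heap should be depends on the machine. As a rough sketch only: the rule used below (half of the physical memory, capped at 31g) is a commonly cited Elasticsearch guideline and is not stated on this page, and the function name is hypothetical.

```shell
#!/bin/sh
# Suggest an ES_HEAP_SIZE value from the total memory in kB:
# half of RAM in whole gigabytes, at least 1g and at most 31g.
suggest_heap() {
  total_kb=$1
  half_gb=$(( total_kb / 1024 / 1024 / 2 ))
  [ "$half_gb" -lt 1 ] && half_gb=1
  [ "$half_gb" -gt 31 ] && half_gb=31
  echo "ES_HEAP_SIZE=${half_gb}g"
}

# On Linux, read the total memory from /proc/meminfo
if [ -r /proc/meminfo ]; then
  suggest_heap "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
fi
```

For a machine with exactly 12 GB of memory this yields the 6g value shown above.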

Managing the cluster and index from the command line

You can check the cluster health via

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
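
For use in scripts, the status field can be pulled out of the health response. This is a minimal sketch with a hypothetical function name; it uses sed instead of a real JSON parser and assumes the response contains a "status" field with the value green, yellow, or red.

```shell
#!/bin/sh
# Read a _cluster/health JSON response on stdin and print its status value.
es_health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Against a live node (assumes the local endpoint used on this page):
#   curl -s 'http://localhost:9200/_cluster/health' | es_health_status
# Offline demonstration with a sample response:
echo '{"cluster_name":"demo","status":"yellow","timed_out":false}' | es_health_status
```

The health endpoint also accepts a wait_for_status parameter together with a timeout, which avoids polling in a loop.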

You can delete ALL indices from a cluster (this cannot be undone) using

curl -XDELETE 'http://localhost:9200/_all'

Other commands:

# create an index
curl -XPUT 'http://localhost:9200/twitter/'

# show available indices and their alias names
curl 'http://localhost:9200/_aliases?pretty=1'

# show information about an index (replace <indexname> with the index name)
curl 'http://localhost:9200/<indexname>/_stats?pretty=1'

# add a document
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
  "tweet" : {
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
  }
}'

# get a document by id
curl -XGET 'http://localhost:9200/twitter/tweet/1'

# search documents
curl -XGET 'http://localhost:9200/twitter/tweet/_search?q=user:kimchy'

Generating Index

To generate a shared resource index, use the admin interface of BibSonomy.

You can also check the state of the BibSonomy system using

curl -XGET 'localhost:9200/posts/_search?q=_type:SystemInformation&size=5&pretty=true'

(Optionally) install plugins

web-frontend

Elasticsearch-head is a web front end plugin for browsing and interacting with an elasticsearch cluster.

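If Elasticsearch has been installed via the .deb package, run the plugin script from the Elasticsearch home directory (/usr/share/elasticsearch for the .deb package, not the Java directory):

cd /usr/share/elasticsearch
./bin/plugin -install mobz/elasticsearch-head

To open the installed plugin, go to http://localhost:9200/_plugin/head/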

other plugins

Refer to the Plugin Guide for a detailed list of available plugins.