gnd / High_Level_Backend_Architecture

Introduction

After considering an RDBMS (PostgreSQL) and an object store (OrientDb), we've ended up using CouchDb. This database offers very fast document submission and retrieval, it uses an HTTP REST API that we're finding easy to integrate with, and it has good-quality documentation.

But CouchDb falls short in faceted search, so we're plugging in ElasticSearch to fill this gap. Hey, now it's a Search Oriented Architecture.

Search Oriented Architecture

As you'll see from the following diagram, we've split the backend processing into two domains: CouchDb and ElasticSearch. This means that indexing is separated from file submission/retrieval. Removing indexing from the database has saved a few milliseconds on every submit operation. The individual saving is small, but it has a cumulative effect. So, when we're submitting lots of documents, very little indexing happens in the database (we keep the CouchDb views to a minimum). Meanwhile, ElasticSearch is listening to the database's _changes feed. As CouchDb stores new documents it announces the changes on this feed. ElasticSearch then indexes these new documents, according to processor availability.
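To make the _changes mechanism concrete, here is a minimal Python sketch of how a listener might consume one response from the feed. It is not how ElasticSearch itself is wired up here; the function names, the `http://localhost:5984` address, and the sample document ids are all hypothetical. It only relies on the documented shape of a CouchDb _changes response (`results`, `seq`, `id`, `deleted`, `last_seq`) and on the `since` and `include_docs` query parameters.

```python
from urllib.parse import urlencode


def changes_url(base_url, db, since=0):
    """Build the URL for polling a CouchDb database's _changes feed."""
    query = urlencode({"since": since, "include_docs": "true"})
    return f"{base_url}/{db}/_changes?{query}"


def plan_index_updates(feed):
    """Split one parsed _changes response into index and delete work.

    Returns (docs_to_index, ids_to_delete, last_seq).
    """
    to_index, to_delete = [], []
    for row in feed["results"]:
        if row.get("deleted"):
            to_delete.append(row["id"])
        elif not row["id"].startswith("_design/"):
            # Design documents hold views, not data, so they are skipped.
            to_index.append(row.get("doc", {"_id": row["id"]}))
    return to_index, to_delete, feed["last_seq"]


# A hand-written response in the shape CouchDb returns from _changes.
sample_feed = {
    "results": [
        {"seq": 1, "id": "track-001", "changes": [{"rev": "1-a"}],
         "doc": {"_id": "track-001", "platform": "alpha"}},
        {"seq": 2, "id": "track-002", "changes": [{"rev": "2-b"}],
         "deleted": True},
        {"seq": 3, "id": "_design/search", "changes": [{"rev": "1-c"}]},
    ],
    "last_seq": 3,
}

to_index, to_delete, last_seq = plan_index_updates(sample_feed)
# The next poll resumes from the last sequence number already handled.
next_url = changes_url("http://localhost:5984", "tracks", since=last_seq)
print(next_url)
```

Remembering `last_seq` between polls is what lets the indexer fall behind during a burst of submissions and catch up later without missing documents.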

[Diagram: Search Oriented Architecture]

Mature REST API

There's a ticket covering the development of a mature REST API.

Bundled in with the development of the mature API is the wish to present the two backend servers as a single system. I currently (Apr 2012) believe that CouchDb externals are the best route to doing this.

Ticket #51 covers the definition of the search API, and #52 covers producing the Java helpers.

Search API

The design of the search API will be modelled on the Google REST API. http://GND/tracks/_design/search/

Successive search components in the URL will be separated by the "/" delimiter.

And/Or

  • The pipe "|" operator will be used to represent OR terms, encoded as %7C
  • The plus "+" operator will be used to represent AND terms

Free text

If only one component is expressed in the URI, it will form a free text search. Note that AND/OR operators can still be present in this component.

  • search/Jack - free text search for the term Jack
  • search/Jack%7CJill - free text search for the terms Jack or Jill
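The free text rules above can be sketched as a small URL builder. This is an illustration only: `free_text_path` and its `operator` argument are hypothetical names, and percent-encoding every reserved character inside a term (so that a "/" or "+" in a term cannot be mistaken for a delimiter) is an assumption that the API definition does not yet state. Mixing AND and OR in one component is left out because operator precedence has not been defined.

```python
from urllib.parse import quote

OR_DELIMITER = "%7C"   # the pipe "|", percent-encoded
AND_DELIMITER = "+"


def free_text_path(terms, operator="or"):
    """Build the relative path for a free text search over the given terms."""
    if not terms:
        raise ValueError("at least one search term is required")
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    delimiter = OR_DELIMITER if operator == "or" else AND_DELIMITER
    # safe="" also encodes "/", "+", "=" and "|" when they occur inside a term.
    encoded = [quote(term, safe="") for term in terms]
    return "search/" + delimiter.join(encoded)


print(free_text_path(["Jack"]))
print(free_text_path(["Jack", "Jill"]))
print(free_text_path(["Jack", "Jill"], operator="and"))
```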

Category for terms

Where categorised search is performed, the search will be expressed as components:

  • search/platform=alpha%7Cbravo - search for documents with a platform of alpha or bravo
  • search/platform=alpha%7Cbravo/sensor=delta - search for documents with a platform of alpha or bravo, AND with a sensor of delta
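Going the other way, the server side needs to turn such a path back into structured filters. The sketch below is one possible reading of the rules, not the agreed implementation: `parse_search_path` is a hypothetical name, a component without "=" is assumed to be free text (returned with a category of `None`), and "+" inside a value is left untouched because AND precedence within a component has not been specified.

```python
import re
from urllib.parse import unquote

# OR terms arrive as %7C; a raw "|" is tolerated as well.
OR_PATTERN = re.compile(r"%7C|\|", re.IGNORECASE)


def parse_search_path(path):
    """Parse 'search/cat=a%7Cb/...' into a list of (category, values) pairs."""
    components = [c for c in path.strip("/").split("/") if c]
    if not components or components[0] != "search":
        raise ValueError("path must start with 'search/'")
    filters = []
    for component in components[1:]:
        category, sep, raw_values = component.partition("=")
        if not sep:
            # No "=" present: treat the whole component as free text.
            category, raw_values = None, component
        # Split on the OR delimiter first, then decode each value.
        values = [unquote(v) for v in OR_PATTERN.split(raw_values)]
        filters.append((category, values))
    return filters


filters = parse_search_path("search/platform=alpha%7Cbravo/sensor=delta")
print(filters)
```

Each pair in the result is ANDed with the others, and the values within a pair are ORed, which mirrors the "/" and "|" rules above.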

The following categories will be allowed:

  • Platform: platform
  • Platform-Type: platform-type
  • Sensor: sensor
  • Sensor-Type: sensor-type
  • Trial: trial
  • Type: type
  • Name: name
  • Data-type: data-type
  • Geo-Bounds: geo-bounds
  • Time-Bounds/Start: time-start
  • Time-Bounds/End: time-end
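A request handler could reject unknown categories before any query reaches ElasticSearch. The following is a minimal sketch using the category names listed above; the function name `unknown_categories` and the `(category, values)` pair representation are assumptions made for illustration.

```python
ALLOWED_CATEGORIES = frozenset([
    "platform", "platform-type", "sensor", "sensor-type", "trial",
    "type", "name", "data-type", "geo-bounds", "time-start", "time-end",
])


def unknown_categories(filters):
    """Return, sorted, the category names in filters that are not allowed."""
    return sorted({cat for cat, _ in filters if cat not in ALLOWED_CATEGORIES})


requested = [("platform", ["alpha", "bravo"]), ("colour", ["red"])]
rejected = unknown_categories(requested)
if rejected:
    print("unknown search categories: " + ", ".join(rejected))
```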
