Introduction
After considering an RDBMS (PostgreSQL) and an object-store (OrientDb), we've ended up using CouchDb. This database offers very fast document submission and retrieval, it uses an HTTP REST API that we're finding easy to integrate with, and it has good quality documentation.
But CouchDb falls short in faceted search, so we're plugging in ElasticSearch to fill this gap. Hey, now it's a Search Oriented Architecture.
Search Oriented Architecture
As you'll see from the following diagram, we've split the backend processing into two domains: CouchDb and ElasticSearch. This means that indexing is separated from file submission/retrieval. Removing indexing from the database saves a few milliseconds on every submit operation. The saving per operation is small, but it has a cumulative effect. So, when we're submitting lots of documents, very little indexing happens in CouchDb itself (we keep the CouchDb views to a minimum). Meanwhile, ElasticSearch is listening to the _changes feed of the database. As CouchDb stores new documents it announces the changes on this stream. ElasticSearch then indexes these new documents as processor availability allows.
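The flow described above can be sketched in a few lines. This is only an illustration, not the project's actual listener: it assumes the feed is read in CouchDb's continuous mode with documents included (one JSON object per line, blank lines as heartbeats), and the function name `index_actions` is hypothetical. It takes the feed lines as plain strings, so it can be tried without a running server.

```python
import json


def index_actions(change_lines):
    """Yield (doc_id, doc) pairs that should be sent to the search index.

    `change_lines` is any iterable of text lines, assumed to come from a
    continuous _changes feed requested with include_docs=true.
    """
    for line in change_lines:
        line = line.strip()
        if not line:
            # Blank lines are heartbeats that keep the connection open.
            continue
        change = json.loads(line)
        if "id" not in change:
            # For example a closing {"last_seq": ...} line.
            continue
        if change.get("deleted") or change["id"].startswith("_design/"):
            # Assumption: deletions and design documents are not indexed here.
            continue
        doc = change.get("doc")
        if doc is not None:
            yield change["id"], doc
```

Each yielded pair would then be handed to whatever performs the ElasticSearch indexing; that step is left out because the source does not describe it.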
Mature REST API
There's a ticket covering the development of a mature REST API.
Part of the development of the mature API is the wish to present the two backend servers as a single system. I currently (Apr 2012) believe that CouchDb externals are the best route to doing this.
Ticket #51 covers the definition of the search API, and #52 covers producing the Java helpers.
Search API
The design of the search API will be modelled on the Google REST API, under the URL http://GND/tracks/_design/search/
Successive search components in the URL will be separated by the "/" delimiter.
And/Or
- The pipe "|" operator will be used to represent OR terms, encoded in the URL as %7C
- The plus "+" operator will be used to represent AND terms
Free text
If only one component is expressed in the URI, it will form a free text search. Note that AND/OR operators can still be present in this component.
- search/Jack - free text search for the term Jack
- search/Jack%7CJill - free text for the terms Jack or Jill
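As a rough client-side sketch of the free text form, the snippet below builds the relative search path for one or more OR-ed terms. The function name `free_text_path` is hypothetical, and the sketch only covers the "|" operator, since the examples above only show OR.

```python
from urllib.parse import quote


def free_text_path(terms):
    """Build a relative free text search path from a list of OR-ed terms."""
    # Join the terms with the OR operator, then percent-encode the whole
    # component so that "|" becomes %7C and "/" cannot split the component.
    component = "|".join(terms)
    return "search/" + quote(component, safe="")


print(free_text_path(["Jack"]))
print(free_text_path(["Jack", "Jill"]))
```

Encoding with `safe=""` also escapes any "/" inside a term, which keeps a term from being mistaken for a second search component.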
Category for terms
Where categorised search is performed, the search will be expressed as components:
- search/platform=alpha%7Cbravo - search for documents with a platform of alpha or bravo
- search/platform=alpha%7Cbravo/sensor=delta - search for documents with a platform of alpha or bravo, AND with a sensor of delta
The following categories will be allowed:
- Platform: platform
- Platform-Type: platform-type
- Sensor: sensor
- Sensor-Type: sensor-type
- Trial: trial
- Type: type
- Name: name
- Data-type: data-type
- Geo-Bounds: geo-bounds
- Time-Bounds/Start: time-start
- Time-Bounds/End: time-end
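To make the URL grammar concrete, here is a minimal parsing sketch. It is not the planned implementation: the function name `parse_search_path` and the shape of the returned dictionary are hypothetical, a component is treated as categorised whenever it contains "=" (an assumption drawn from the examples), and the "+" operator is deliberately left unhandled because its precedence relative to "|" is not specified above.

```python
from urllib.parse import unquote

# Category keys as listed above.
ALLOWED_CATEGORIES = {
    "platform", "platform-type", "sensor", "sensor-type", "trial", "type",
    "name", "data-type", "geo-bounds", "time-start", "time-end",
}


def parse_search_path(path):
    """Parse a relative path such as 'search/platform=alpha%7Cbravo/sensor=delta'."""
    prefix = "search/"
    if not path.startswith(prefix):
        raise ValueError("not a search path: %r" % path)
    # Split on "/" before decoding, so an encoded slash stays inside its component.
    components = [c for c in path[len(prefix):].split("/") if c]
    if not components:
        raise ValueError("empty search")
    if len(components) == 1 and "=" not in components[0]:
        # A single component without a category is a free text search.
        return {"free_text": unquote(components[0]).split("|")}
    categories = {}
    for component in components:
        name, sep, value = component.partition("=")
        if not sep or name not in ALLOWED_CATEGORIES:
            raise ValueError("unknown or malformed component: %r" % component)
        # Values within one component are alternatives (OR);
        # separate components are combined with AND.
        categories[name] = unquote(value).split("|")
    return {"categories": categories}
```

For example, `parse_search_path("search/platform=alpha%7Cbravo/sensor=delta")` gives a `categories` mapping with two alternatives for `platform` and one value for `sensor`, while an unlisted category raises `ValueError`.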