gnd / SchemaValidationStrategy

Introduction

To make it easier to get the data out, I'm keen that we groom and verify the data on the way in. We can do this in two ways: by checking that the correct fields are present (as covered in the sections further down), and by verifying that fields are of the correct type.

JSON Schema (http://json-schema.org/) is a way of defining a schema using only JSON.

Note: there is a JavaScript-based JSON Schema validator at https://github.com/garycourt/JSV. It may be possible to integrate this into a CouchDB validate function.
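As a rough sketch of where such a check could sit: CouchDB calls a design document's validate_doc_update function with (newDoc, oldDoc, userCtx, secObj) on every write, and throwing an object with a forbidden property rejects the document. The trackSchema object and checkTypes helper below are hypothetical stand-ins for a real schema and validator; JSV itself is not called here, and the field names are assumptions.

```javascript
// Hypothetical, hand-rolled stand-in for a schema: field name -> expected typeof.
// A real JSON Schema validator (such as JSV) would replace checkTypes().
var trackSchema = {
  name: "string",
  platform: "string",
  created: "string"
};

// Returns a list of messages for fields that are present but of the wrong type.
function checkTypes(doc, schema) {
  var errors = [];
  Object.keys(schema).forEach(function (key) {
    if (key in doc && typeof doc[key] !== schema[key]) {
      errors.push(key + " should be of type " + schema[key]);
    }
  });
  return errors;
}

// Same argument list as CouchDB's validate_doc_update.
// Throwing { forbidden: ... } rejects the write.
function validateDocUpdate(newDoc, oldDoc, userCtx, secObj) {
  if (newDoc._deleted) {
    return; // deletions carry no body worth validating
  }
  var errors = checkTypes(newDoc, trackSchema);
  if (errors.length > 0) {
    throw { forbidden: errors.join("; ") };
  }
}
```

In a design document the function would be stored as a string under the validate_doc_update key, so any helpers it needs would have to live inside the function body.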

Type validation

A prime candidate for type validation is ensuring that Date fields adhere to the relevant ISO date standard (ISO 8601). We may also wish to check that angular values are in the range 0..360 (including position), or that type-attributes are of an expected type.
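The two checks mentioned above might look something like the following. This is only a sketch: the function names are hypothetical, the date pattern accepts just one ISO 8601 profile (full date and time with a Z or numeric offset), and treating both 0 and 360 as valid is an assumption about what "0..360" means.

```javascript
// One ISO 8601 profile: YYYY-MM-DDTHH:MM:SS, optional fraction, then Z or +/-HH:MM.
// Range-checks each component but does not check days per month (e.g. 30 February).
var ISO_DATE_TIME = new RegExp(
  "^\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])" +
  "T([01]\\d|2[0-3]):[0-5]\\d:[0-5]\\d(\\.\\d+)?" +
  "(Z|[+-]([01]\\d|2[0-3]):[0-5]\\d)$"
);

function isIsoDateTime(value) {
  return typeof value === "string" && ISO_DATE_TIME.test(value);
}

// Assumes angles are stored as numbers in degrees, with both ends inclusive.
function isAngle(value) {
  return typeof value === "number" && isFinite(value) &&
    value >= 0 && value <= 360;
}
```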

Top Level

We will probably want to do some kind of top-level validation to ensure that core/essential/common fields are present. These may include:

  • schema/document type
  • created date
  • name
  • platform
  • sensor
  • inserted by
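A presence check for these fields could be as small as the sketch below. The key names (type, created, name, platform, sensor, inserted_by) are hypothetical spellings of the list above, not an agreed convention.

```javascript
// Hypothetical key names for the core fields listed above.
var REQUIRED_FIELDS = ["type", "created", "name", "platform", "sensor", "inserted_by"];

// Returns the names of required fields that are absent or null.
function missingFields(doc) {
  return REQUIRED_FIELDS.filter(function (field) {
    return doc[field] === undefined || doc[field] === null;
  });
}
```

A validate function could reject the document whenever missingFields returns a non-empty array, and report the missing names back to the client.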

There may also be optional top-level elements that describe how data is derived from, or contributes to other documents:

  • derived-from
  • fed-into

There may also be a version history, with a description of why each revision was made.

As the 'may' in the paragraphs above highlights, these elements are optional, so we can't require them to exist. We could validate them when they're present, though we can't check that the documents they refer to exist.
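One way to picture "validate only if present" is sketched below. It assumes, purely for illustration, that derived-from and fed-into each hold an array of document id strings. The check is limited to the shape of the values, because a CouchDB validate function only receives the document being written and cannot look up other documents.

```javascript
// Assumed shape: an array of non-empty document id strings.
function isIdList(value) {
  return Array.isArray(value) && value.every(function (id) {
    return typeof id === "string" && id.length > 0;
  });
}

// Checks the optional link fields only when they are present.
function checkOptionalLinks(doc) {
  var errors = [];
  ["derived-from", "fed-into"].forEach(function (field) {
    if (field in doc && !isIdList(doc[field])) {
      errors.push(field + " must be an array of non-empty document ids");
    }
  });
  return errors;
}
```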

Discrete

Some attributes won't be needed for all document types. These could be broken down into geospatial elements and 'others'.

GeoSpatial

  • x/y coordinate pairs
  • bounding-box
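A sketch of how these two geospatial elements might be checked follows. The representations are assumptions borrowed from common GeoJSON practice: a coordinate pair as [x, y] and a bounding box as [minX, minY, maxX, maxY]. Boxes that cross the antimeridian are not handled.

```javascript
function isFiniteNumber(n) {
  return typeof n === "number" && isFinite(n);
}

// Assumed representation: [x, y]
function isCoordinatePair(value) {
  return Array.isArray(value) && value.length === 2 &&
    value.every(isFiniteNumber);
}

// Assumed representation: [minX, minY, maxX, maxY], with min <= max on each axis.
function isBoundingBox(value) {
  return Array.isArray(value) && value.length === 4 &&
    value.every(isFiniteNumber) &&
    value[0] <= value[2] && value[1] <= value[3];
}
```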

Others

  • media reference
  • array of time-stamps
  • start-time/finish time
  • array of data observations (non spatial)
  • metadata about the subject being observed: its id, and any other environmental characteristics
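For the start-time/finish time pair, one plausible consistency check is that the finish does not precede the start. The sketch below assumes both values are ISO 8601 date-time strings that the runtime's Date.parse accepts; the function name is hypothetical.

```javascript
// Returns true when both values parse as dates and start is not after finish.
// Unparseable input yields NaN, and any comparison with NaN is false.
function isValidInterval(start, finish) {
  if (typeof start !== "string" || typeof finish !== "string") {
    return false;
  }
  return Date.parse(start) <= Date.parse(finish);
}
```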
