gnd / DocumentMetadata

General

This is a collection of meta tags covering things like:

  • platform name (e.g. Vehicle Reg:233 ASD11)
  • platform type [a] (e.g. VW Golf)
  • sensor name (e.g. TomTom 3400)
  • sensor type [a] (e.g. GPS)
  • start_time
  • end_time
  • derived_from (ids of documents used in production of this one)
  • contributes_to (ids of documents based on the content of this one)
  • bounding box, as described in the GeoJSON Spec
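As a rough illustration of what one record carrying these tags might look like, here is a minimal Python sketch. The key names, the timestamps, the track coordinates and the `bounding_box` helper are all assumptions made for the example, not an agreed schema. The bounding box is stored GeoJSON-style, with the lowest values for each axis followed by the highest.

```python
import json


def bounding_box(points):
    """Return [west, south, east, north] for a list of (lon, lat) pairs."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return [min(lons), min(lats), max(lons), max(lats)]


# Hypothetical (longitude, latitude) fixes, purely for illustration.
track = [(-1.10, 50.80), (-1.05, 50.82), (-1.08, 50.79)]

# Key names are assumptions; the wiki only lists the concepts.
metadata = {
    "platform_name": "Vehicle Reg:233 ASD11",
    "platform_type": "VW Golf",
    "sensor_name": "TomTom 3400",
    "sensor_type": "GPS",
    "start_time": "2013-01-01T09:00:00Z",  # made-up ISO 8601 timestamps
    "end_time": "2013-01-01T09:30:00Z",
    "derived_from": [],    # ids of documents used to produce this one
    "contributes_to": [],  # ids of documents based on this one
    "bbox": bounding_box(track),
}

print(json.dumps(metadata, indent=2))
```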

Dublin Core

The following Dublin Core elements seem to be of interest. Notes relating to how they could be used in our context aren't formal requirements that they be used - they're just a brain-dump of thoughts.

  • _dc.contributor: An entity responsible for making contributions to the content of the resource. This can store the ids of documents that this one is derived from.
  • _dc.coverage: The extent or scope of the content of the resource. Note, the coverage field can store bounding box and time period.
  • _dc.creator: An entity primarily responsible for making the content of the resource. This could be an organisation, a vehicle, or the recording system (sensor). Sensor seems the most obvious usage. Actually, it could also store the platform as well. Sensor would be compulsory, platform optional.
  • _dc.date: A date associated with an event in the life cycle of the resource. This would be the date the document was submitted to the database.
  • _dc.description: An account of the content of the resource.
  • _dc.format: Typically, Format may include the media-type or dimensions of the resource. This could be a mime-type using a standard like application/json. Alternatively it could be a more specific format like application/vnd.gnd. Thus we could also store application/vnd.GeoJSON if we wanted to store raw GeoJSON for some reason. On reflection this may be the ideal placeholder for the schema-type: 2d, 3d, 2d+time, time-variable, etc.
  • _dc.identifier: Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
  • _dc.modified: Date on which the resource was changed.
  • _dc.publisher: An entity responsible for making the resource available. This would be where we store the name of the person who submitted the data to the database.
  • _dc.relation: A reference to a related resource. This could prove useful to link datasets. The Qualified Dublin Core allows for other elements to describe the nature of the relationship. But these additional relationships would be non-standard - ideally we'll stick to core elements. This seems the best placeholder for documents that this one contributes to.
  • _dc.rights: Information about rights held in and over the resource. This could store the distribution rights of the dataset, or how it should be protected (e.g. "personal", "public").
  • _dc.source: A reference to a resource from which the present resource is derived. This could store the media reference where the dataset was extracted from, or null for a document generated from other ones.
  • _dc.subject: The topic of the content of the resource. Subject could be used to contain keywords.
  • _dc.title: A name given to the resource. This could be the CouchDB id. Alternatively it could be a nice human-readable title that includes platform, sensor, part of the id.
  • _dc.type: The nature or genre of the content of the resource. There is a standard set of types, one of which is Dataset. Others are Image, Sound, Software, Event. These could represent other types of data in the datastore.
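To make the notes above a bit more concrete, here is one possible mapping from the general tags onto the `_dc.*` fields, as a Python sketch. Like the notes themselves, it is a brain-dump rather than a requirement: the `to_dublin_core` function, its parameters, the input key names and the shape of the coverage value are all assumptions.

```python
def to_dublin_core(meta, doc_id, submitter, submitted_on):
    """Map general tags onto _dc.* fields, following the notes above."""
    creator = [meta["sensor_name"]]      # sensor would be compulsory
    if meta.get("platform_name"):        # platform is optional
        creator.append(meta["platform_name"])
    return {
        "_dc.identifier": doc_id,
        "_dc.creator": creator,
        # ids of documents this one is derived from
        "_dc.contributor": list(meta.get("derived_from", [])),
        # ids of documents this one contributes to
        "_dc.relation": list(meta.get("contributes_to", [])),
        # coverage holds both the bounding box and the time period
        "_dc.coverage": {
            "bbox": meta.get("bbox"),
            "start_time": meta.get("start_time"),
            "end_time": meta.get("end_time"),
        },
        "_dc.date": submitted_on,        # date submitted to the database
        "_dc.publisher": submitter,      # person who submitted the data
        "_dc.format": "application/vnd.gnd",
        "_dc.type": "Dataset",
    }


# Hypothetical input; ids, names and dates are invented for the example.
example = to_dublin_core(
    {
        "sensor_name": "TomTom 3400",
        "platform_name": "Vehicle Reg:233 ASD11",
        "derived_from": ["doc-0"],
    },
    doc_id="doc-1",
    submitter="A. Submitter",
    submitted_on="2013-01-02",
)
print(example["_dc.creator"])
```

A missing sensor name raises a `KeyError` here, which is one simple way of enforcing the "sensor compulsory, platform optional" idea.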

We may well extract all of this data into a view that is more easily searched.
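The sketch below shows the general shape such a view could take: a map step that emits searchable keys per document, and an index built from those rows. It is written in Python purely for illustration (a real CouchDB view would normally be a JavaScript map function), and the `map_metadata` and `build_index` names, the choice of keys, and the sample documents are all assumptions.

```python
def map_metadata(doc):
    """Yield (key, doc_id) rows, in the spirit of a view's map step."""
    doc_id = doc.get("_dc.identifier")
    for creator in doc.get("_dc.creator", []):
        yield (("creator", creator), doc_id)
    for keyword in doc.get("_dc.subject", []):
        yield (("subject", keyword), doc_id)
    if "_dc.type" in doc:
        yield (("type", doc["_dc.type"]), doc_id)


def build_index(docs):
    """Collect the emitted rows into a dict of key -> list of document ids."""
    index = {}
    for doc in docs:
        for key, doc_id in map_metadata(doc):
            index.setdefault(key, []).append(doc_id)
    return index


# Hypothetical documents, invented for the example.
docs = [
    {"_dc.identifier": "doc-1", "_dc.creator": ["TomTom 3400"],
     "_dc.subject": ["trial"], "_dc.type": "Dataset"},
    {"_dc.identifier": "doc-2", "_dc.creator": ["TomTom 3400"],
     "_dc.type": "Dataset"},
]

index = build_index(docs)
print(index[("creator", "TomTom 3400")])
```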

[a]. Note: maybe we'll have tables of platforms and sensors which know their types. Then, we can derive the supporting data and include it in a view/index.
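A minimal sketch of that idea, assuming simple lookup tables keyed by name. The table contents reuse the examples from the General list; the `PLATFORM_TYPES`, `SENSOR_TYPES` and `with_derived_types` names are hypothetical.

```python
# Hypothetical lookup tables: name -> type.
PLATFORM_TYPES = {"Vehicle Reg:233 ASD11": "VW Golf"}
SENSOR_TYPES = {"TomTom 3400": "GPS"}


def with_derived_types(meta):
    """Return a copy of meta with the types filled in from the tables.

    Unknown names yield None rather than raising, so a view/index can
    still include the document.
    """
    enriched = dict(meta)
    enriched["platform_type"] = PLATFORM_TYPES.get(meta.get("platform_name"))
    enriched["sensor_type"] = SENSOR_TYPES.get(meta.get("sensor_name"))
    return enriched


record = with_derived_types(
    {"platform_name": "Vehicle Reg:233 ASD11", "sensor_name": "TomTom 3400"}
)
print(record["platform_type"], record["sensor_type"])
```

With this arrangement only the names need to be stored on each document; the types are derived when the view or index is built.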
