gnd / AnalysisRequirements

Introduction

Here is a collection of thoughts related to the bigger-picture ecosystem.

TRF

Here are some thoughts taken from the TRF documents of the early Noughties:

  • Gap analysis: on occasion there are large gaps in a dataset. These are only recognised at the end of the data pipeline once the data has been exported and is being analysed. It would be better to highlight gaps at the export stage, possibly identifying them at the import stage. The system should probably handle these problems by splitting the document into two or more documents.
  • Online, up-to-date guidance. Where peripheral algorithms are used or developed, users would like guidance integrated into the app. If the algorithms are implemented in ETL then a tiddlywiki-style system may suffice. If we implement them in Deb then they should go into the Deb user manual.
  • One-stop shop. The system should not just store the observation data, it should also store data related to managing that data: coverages, exercise dates, and the evolution of a dataset (corrections).
  • Version control of algorithms. Algorithms may have to evolve, and that evolution should be version controlled. ETL scripts could be kept in VC; Deb algorithms are already under VC. A DVCS may be an option, letting users make local changes that are then migrated to master.
  • Version control of data. Corrections may be made to a dataset, and we should store prior versions as data is rectified/smoothed/corrected.
  • Pedigree of data. When a dataset is corrected or transformed, it is proposed that two-way linkages be implemented to ease tracking. Where did a file come from? What other data has been generated based on this dodgy file? This linkage could also include a verb detailing how a dataset has been transformed (interpolated, combined, filtered, trimmed, decimated, smoothed, rectified).
  • Track operations. These things may need to be done to a track: merge tracks from multiple sources into a composite track, remove or apply drift, remove jumps, resample to regular time steps, move the track, bend the track.
  • Earth Model. Positional data may come from different earth models. Either allow transformation between models, or always use one (WGS-84).
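
As an illustration of the gap analysis item, here is a minimal sketch of how gaps might be detected at import or export time and used to split a dataset into several documents. The function name `split_at_gaps` and the `max_gap` threshold are assumptions made for this example, not part of any existing ETL or Deb code.

```python
from datetime import datetime, timedelta


def split_at_gaps(timestamps, max_gap):
    """Split time-sorted timestamps into segments.

    A new segment starts whenever two consecutive samples are more than
    max_gap apart. Hypothetical helper: the threshold is an assumption.
    """
    segments = []
    current = []
    for t in timestamps:
        # Close the running segment when the jump from the previous
        # sample exceeds the allowed gap.
        if current and t - current[-1] > max_gap:
            segments.append(current)
            current = []
        current.append(t)
    if current:
        segments.append(current)
    return segments


start = datetime(2000, 1, 1)
times = [start + timedelta(seconds=s) for s in (0, 10, 20, 500, 510)]
segments = split_at_gaps(times, timedelta(seconds=60))
```

Each returned segment could then be stored as its own document, with the gap boundaries reported to the user at the export stage.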
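
The two-way pedigree linkage could be sketched roughly as follows. The `Dataset` class, the `derive` and `descendants` helpers, and the file names are all hypothetical, chosen only to show how a link carrying a transformation verb can be followed in both directions.

```python
from dataclasses import dataclass, field


# eq=False keeps identity-based comparison, which avoids recursing
# through the parent/child cycle when two datasets are compared.
@dataclass(eq=False)
class Dataset:
    name: str
    parents: list = field(default_factory=list)   # (verb, Dataset) pairs
    children: list = field(default_factory=list)  # (verb, Dataset) pairs


def derive(source, name, verb):
    """Create a dataset derived from source and record the link both ways."""
    child = Dataset(name)
    child.parents.append((verb, source))
    source.children.append((verb, child))
    return child


def descendants(dataset):
    """Names of every dataset generated, directly or indirectly, from dataset."""
    seen = []
    stack = [dataset]
    while stack:
        node = stack.pop()
        for _verb, child in node.children:
            if all(child is not s for s in seen):
                seen.append(child)
                stack.append(child)
    return [d.name for d in seen]


raw = Dataset("raw.rep")
smoothed = derive(raw, "smoothed.rep", "smoothed")
trimmed = derive(smoothed, "trimmed.rep", "trimmed")
affected = descendants(raw)
```

Following `parents` answers "where did a file come from?", while `descendants` answers "what other data has been generated based on this dodgy file?".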
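
Of the track operations listed, resampling to regular time steps is the simplest to sketch. The example below assumes a track is a list of `(time_seconds, lat, lon)` tuples with strictly increasing times, and uses naive linear interpolation of latitude and longitude; a real implementation would need to account for the earth model and for longitude wrap-around. The `resample` name and the tuple layout are assumptions for illustration.

```python
def resample(track, step):
    """Resample a track onto regular time steps by linear interpolation.

    track: list of (t, lat, lon) tuples with strictly increasing t.
    step:  spacing of the output samples, in the same units as t.
    Naive sketch: treats lat/lon as flat coordinates.
    """
    if len(track) < 2:
        return list(track)
    out = []
    i = 0
    t = track[0][0]
    end = track[-1][0]
    while t <= end:
        # Advance to the input segment that contains time t.
        while track[i + 1][0] < t:
            i += 1
        t0, lat0, lon0 = track[i]
        t1, lat1, lon1 = track[i + 1]
        f = (t - t0) / (t1 - t0)
        out.append((t, lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)))
        t += step
    return out


track = [(0, 0.0, 0.0), (10, 1.0, 2.0), (30, 3.0, 2.0)]
regular = resample(track, 5)
```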
