MGS Canopy Algorithm / Input and output

##Input

###Point file

The program expects an input file containing points and their profiles.

  • points must be specified one per line
  • the first column must always be the point's name
  • the second and following columns must be the point's dimension positions (its profile)
  • all columns are whitespace-delimited (tabs or spaces)

In addition:

  • the input file cannot have any kind of header
  • all points must have the same number of data points (equal profile lengths)

Example:

Point0 0 0 0 1 3 3
Point1 1 3 0 1 9 6
Point2 0 6 0 1 0 3
Point3 0 3 0 2 2 1
Point4 0 2 0 1 1 7
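
The rules above can be checked before running the program. The following is a minimal sketch, not part of the tool itself: the function name `read_points` is hypothetical, and it assumes profile values are numeric, blank lines can be skipped, and point names are unique.

```python
def read_points(lines):
    """Parse point-file lines into a dict mapping point name -> profile."""
    points = {}
    expected_len = None
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # splits on any run of tabs or spaces
        if not fields:
            continue  # assumption: blank lines are skipped
        name, raw = fields[0], fields[1:]
        # A header row with non-numeric column names fails here with ValueError.
        profile = [float(value) for value in raw]
        if expected_len is None:
            expected_len = len(profile)
        elif len(profile) != expected_len:
            raise ValueError(
                f"line {line_no}: expected {expected_len} values, got {len(profile)}"
            )
        # Assumption: names are unique; a repeated name would overwrite the earlier one.
        points[name] = profile
    return points


example = [
    "Point0 0 0 0 1 3 3",
    "Point1 1 3 0 1 9 6",
    "Point2 0 6 0 1 0 3",
]
points = read_points(example)
print(len(points), "points read")
```

To check a real file, pass an open file handle instead of the list, since iterating over a file yields its lines.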

##Output

###Cluster file

The cluster file contains, line by line, tab-separated pairs of values: <cluster name> <point name>. Clusters are sorted by size, from the largest to the smallest. Points are not sorted in any particular order.

Example:

MGU00000        Point0
MGU00000        Point3
MGU00000        Point4
MGU00000        Point5
MGU00000        Point7
MGU00001        Point0
MGU00001        Point3
MGU00001        Point8
MGU00001        Point9
MGU00002        Point13
MGU00002        Point12
MGU00002        Point3
MGU00003        Point1
MGU00003        Point5
MGU00004        Point5
MGU00004        Point6
MGU00005        Point8
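
As the example shows, a point may be listed under more than one cluster (Point3 appears in three of them). A short sketch of reading this file back follows. The function names are hypothetical and the sample data is a shortened, illustrative excerpt; the sketch also assumes that names contain no whitespace, so splitting on any whitespace is equivalent to splitting on the tab.

```python
from collections import defaultdict


def read_clusters(lines):
    """Group point names by cluster name, keeping clusters in file order."""
    clusters = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines are ignored
        cluster, point = line.split()  # tab-separated pair; spaces are accepted too
        clusters[cluster].append(point)
    return dict(clusters)


def sorted_by_size(clusters):
    """Return True if clusters appear from largest to smallest."""
    sizes = [len(members) for members in clusters.values()]
    return sizes == sorted(sizes, reverse=True)


def clusters_of(clusters, point):
    """List every cluster that contains the given point."""
    return [name for name, members in clusters.items() if point in members]


sample = [
    "MGU00000\tPoint0",
    "MGU00000\tPoint3",
    "MGU00000\tPoint4",
    "MGU00001\tPoint0",
    "MGU00001\tPoint3",
    "MGU00002\tPoint3",
]
clusters = read_clusters(sample)
print(sorted_by_size(clusters))
print(clusters_of(clusters, "Point3"))
```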

###Cluster profiles file

The cluster profiles file contains, line by line, each cluster's name and its profile. The first column is always the cluster name, and the following columns are the profile data points, in the same order as the columns of the point input file. All columns are tab-separated.

Example:

MGU00000        1 0 0 3 3 3
MGU00001        0 8 0 1 2 1
MGU00002        0 3 0 1 3 3
MGU00003        0 2 1 1 6 3
MGU00004        9 0 0 1 3 0
MGU00005        8 0 0 1 3 0
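
Because the profile columns follow the order of the point input file, each cluster profile should have as many values as each input point. The sketch below reads the file and optionally checks that width. The name `read_profiles` and its `expected_len` parameter are hypothetical, and values are assumed to be numeric.

```python
def read_profiles(lines, expected_len=None):
    """Parse cluster-profile lines into a dict mapping cluster name -> profile."""
    profiles = {}
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()  # tab-separated columns; spaces are accepted too
        if not fields:
            continue  # assumption: blank lines are skipped
        name = fields[0]
        values = [float(value) for value in fields[1:]]
        if expected_len is not None and len(values) != expected_len:
            raise ValueError(
                f"line {line_no}: expected {expected_len} values, got {len(values)}"
            )
        profiles[name] = values
    return profiles


sample = [
    "MGU00000\t1\t0\t0\t3\t3\t3",
    "MGU00001\t0\t8\t0\t1\t2\t1",
]
# 6 is the profile length of the points in the input example.
profiles = read_profiles(sample, expected_len=6)
print(profiles["MGU00001"])
```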
