MGS Canopy Algorithm / Home


This wiki is intended to help you understand, obtain and run our implementation of the canopy clustering algorithm.

We would like to hear your opinion. If you have any comments or questions, please contact Bjorn or Piotr.

Canopy clustering introduction

Our variation of canopy clustering focuses on efficiently clustering points in a multi-dimensional Pearson correlation space. The heuristic starts by choosing a point at random (the seed point) and, after computing its distance to all other points, singling out those that lie within a specified canopy distance. The median profile of those points is then calculated, yielding a canopy centroid. Canopy creation is then repeated iteratively from the previously created centroid (the canopy walk) until the distance between consecutive centroids is small enough. All points from the last canopy are then marked and will not become seed points again. This procedure is repeated for all remaining possible seed points.
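The steps described above can be illustrated with a minimal Python sketch. It is not the code of our executable, only an illustration under stated assumptions: the distance between two profiles is taken to be 1 minus their Pearson correlation, the seed itself is always marked so that the loop terminates, and the names `max_dist`, `stop_dist` and `max_steps` are hypothetical and do not correspond to the actual program parameters.

```python
import math
import random
from statistics import median


def pearson_distance(a, b):
    """Assumed distance measure: 1 - Pearson correlation of two profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # constant profile: treat as uncorrelated (assumption)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def build_canopy(center, points, max_dist):
    """Collect points within max_dist of center and return their median profile."""
    members = [i for i, p in enumerate(points)
               if pearson_distance(center, p) <= max_dist]
    if not members:
        return members, list(center)  # nothing nearby: keep the old centroid
    centroid = [median(column) for column in zip(*(points[i] for i in members))]
    return members, centroid


def canopy_clustering(points, max_dist=0.1, stop_dist=0.01, max_steps=50, seed=0):
    """Return a list of (centroid, member_indices) pairs, one per canopy."""
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)  # visit seed points in random order
    marked = set()
    canopies = []
    for s in order:
        if s in marked:
            continue  # marked points never become seed points again
        members, centroid = build_canopy(points[s], points, max_dist)
        # Canopy walk: rebuild the canopy around the latest centroid
        # until the centroid stops moving (or max_steps is reached).
        for _ in range(max_steps):
            new_members, new_centroid = build_canopy(centroid, points, max_dist)
            moved = pearson_distance(centroid, new_centroid)
            members, centroid = new_members, new_centroid
            if moved <= stop_dist:
                break
        marked.add(s)            # assumption: the seed is always marked
        marked.update(members)   # points of the last canopy are marked
        canopies.append((centroid, members))
    return canopies


# Small usage example with made-up profiles: three rising and three falling.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 2, 3, 5],
            [4, 3, 2, 1], [8, 6, 4, 2], [5, 3, 2, 1]]
for centroid, members in canopy_clustering(profiles):
    print(members, centroid)
```

Note that in this sketch marking only prevents a point from being chosen as a seed. A marked point can still fall within the canopy distance of a later centroid and so belong to more than one canopy.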


  1. Requirements
  2. How to obtain the executable
  3. Program parameters
  4. Input and output
  5. Example run