Appearance-based methods for visual localisation

This is a MATLAB library that extracts visual descriptors from video sequences taken by multiple users and implements a bag-of-visual-words pipeline on top of them in order to provide localisation.

The code is customised and ready to be used with the RSM dataset, but it can be used on any image sequences if the directory paths are correctly specified.

Currently implemented descriptor extraction methods (described below): LW_COLOR, SIFT, DSIFT, SF_GABOR, ST_GABOR, ST_GAUSS

Currently supported sequence format: jpg


Date: v4.1 11/2015


SIFT, DSIFT, VLAD and kernel implementations require VLFEAT.

Clustering requires INRIA's Yael K-means.
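Before running the pipeline it can help to confirm that both toolboxes are visible to MATLAB. The following is only a sketch of such a check and is not part of the library. It assumes that VLFeat exposes vl_sift, vl_dsift and vl_vlad, and that Yael's MATLAB k-means wrapper is named yael_kmeans; adjust the names if your installation differs.

```matlab
% Hypothetical pre-flight check: are the external toolboxes on the path?
% vl_sift, vl_dsift and vl_vlad are VLFeat functions; yael_kmeans is
% assumed to be the name of Yael's MATLAB k-means wrapper.
required = {'vl_sift', 'vl_dsift', 'vl_vlad', 'yael_kmeans'};

% exist() returns 0 when a name cannot be resolved on the current path.
isMissing = cellfun(@(name) exist(name) == 0, required);
missing = required(isMissing);

if isempty(missing)
    disp('All external dependencies were found.');
else
    fprintf('Not found on the MATLAB path: %s\n', strjoin(missing, ', '));
end
```

If any name is reported as missing, add the corresponding toolbox directory to the path (for example with addpath) before running main.m.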

Running Instructions:

Copy initialize.m.template to initialize.m:

cp initialize.m.template initialize.m

Run main.m


Detailed Instructions:

Parameter selection

Select your choice from the following parameters in the params structure before continuing:

params = struct(...
    'descriptor',      'ST_GAUSS', ...   % LW_COLOR, SIFT, DSIFT, SF_GABOR, ST_GABOR, ST_GAUSS
    'corridors',       1:6, ...          % Corridors to run [1:6] (RSM v6.0)
    'passes',          1:10, ...         % Passes to run [1:10] (RSM v6.0)
    'trainingSet',     [1:3,5], ...      % Training set used for dictionary construction
    'datasetDir',      '/data/datasets/RSM/visual_paths/v6.0', ...   % The root path of the RSM dataset
    'frameDir',        'frames_resized_w208p', ...   % Folder name where all the frames have been extracted
    'descrDir',        '/data/datasets/RSM/descriptors', ...
    'dictionarySize',  400, ...          % Number of visual words (k in k-means)
    'dictPath',        '/data/datasets/RSM/dictionaries', ...
    'encoding',        'HA', ...         % 'HA', 'VLAD', 'LLC'
    'kernel',          'chi2', ...       % 'chi2', 'Hellinger'
    'kernelPath',      '/data/datasets/RSM/kernels', ...
    'metric',          'max', ...
    'groundTruthPath', './ground_truth', ...
    'debug',           1 ...             % 1 shows waitbars, 0 does not
);

The parameters are as follows:

  • datasetDir: The root path of the RSM dataset
  • corridors: Corridors to run [1:6] (RSM v6.0)
  • passes: Passes to run [1:10] (RSM v6.0)
  • trainingSet: training set to use for dictionary construction
  • frameDir: Folder name where all the frames have been extracted.
  • descrDir: directory where the extracted descriptors are stored.
  • descriptor: type of descriptor to compute. Options:
    • LW_COLOR: Lightweight spatio-temporal colour descriptor
    • SIFT: keypoint based SIFT descriptors
    • DSIFT: Dense SIFT
    • SF_GABOR: Frame-based DAISY-like descriptors
    • ST_GABOR: Spatio-temporal Gabors.
    • ST_GAUSS: Spatio-temporal, Spatial Derivative, Temporal Gaussian
  • dictionarySize: number of visual words (parameter k in k-means)
  • dictPath: directory where to store the created dictionaries.
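  • encoding: encoding method ('HA' for hard assignment, 'VLAD' or 'LLC')
  • kernel: kernel type ('chi2' or 'Hellinger')
  • kernelPath: directory where the computed kernels are stored.
  • metric: set to 'max' in the example above.
  • groundTruthPath: path to the ground-truth directory.
  • debug: 1 shows waitbars, 0 does not.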
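Because a mistyped option name may only surface late in a long run, a quick sanity check of the params structure can be worthwhile. The sketch below is an illustration rather than part of the library: it uses a reduced params structure and checks the string options against the values listed above. The variable names validDescriptors, validEncodings and validKernels are hypothetical.

```matlab
% Illustrative subset of the params structure described above.
params = struct( ...
    'descriptor',     'ST_GAUSS', ...
    'dictionarySize', 400, ...
    'encoding',       'HA', ...
    'kernel',         'chi2');

% Allowed values, copied from the lists in this README.
validDescriptors = {'LW_COLOR', 'SIFT', 'DSIFT', 'SF_GABOR', 'ST_GABOR', 'ST_GAUSS'};
validEncodings   = {'HA', 'VLAD', 'LLC'};
validKernels     = {'chi2', 'Hellinger'};

% Fail early, with a readable message, on an unknown option.
assert(ismember(params.descriptor, validDescriptors), ...
    'Unknown descriptor: %s', params.descriptor);
assert(ismember(params.encoding, validEncodings), ...
    'Unknown encoding: %s', params.encoding);
assert(ismember(params.kernel, validKernels), ...
    'Unknown kernel: %s', params.kernel);

% The dictionary size is the k of k-means, so it must be a positive integer.
k = params.dictionarySize;
assert(isscalar(k) && k > 0 && k == round(k), ...
    'dictionarySize must be a positive integer.');

disp('Parameter check passed.');
```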

Descriptor generation


Bag of Words pipeline

  • create_dictionaries (k-means vector quantization)
    • clusterDescriptorsSparse (for Keypoint-SIFT)
  • BOVW encoding (Hard assignment, VLAD, or LLC)

  • Kernels for histograms

  • Run evaluation routine to add the error measurement to the kernels.
  • Generate PDF results and plots with
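As a rough illustration of the encoding and kernel steps listed above, the sketch below builds hard-assignment histograms for two toy descriptor sets and compares them with a chi-squared kernel. It does not use the library's own functions, VLFeat or Yael. The dictionary and descriptors are made-up toy values, and the kernel formula (sum of 2xy/(x+y)) is one common definition of the chi-squared kernel, which may differ from the one the library relies on.

```matlab
% Toy data (hypothetical): 2-D descriptors stored one per column,
% and a dictionary of 3 visual words.
dictionary = [0 10 0; 0 0 10];      % columns are visual words
descrA = [1 9 0 1; 0 1 9 1];        % descriptors from one sequence
descrB = [0 1 10 9; 1 0 1 0];       % descriptors from another sequence
K = size(dictionary, 2);

% Squared Euclidean distance between every word (rows) and every
% descriptor (columns), giving a K-by-N matrix.
sqDist = @(X, C) bsxfun(@plus, sum(C.^2, 1)', sum(X.^2, 1)) - 2 * (C' * X);

% Hard assignment: each descriptor votes for its nearest visual word.
[~, idxA] = min(sqDist(descrA, dictionary), [], 1);
[~, idxB] = min(sqDist(descrB, dictionary), [], 1);

% Count the votes per word and L1-normalise to obtain the histograms.
histA = accumarray(idxA(:), 1, [K 1]);
histA = histA / sum(histA);
histB = accumarray(idxB(:), 1, [K 1]);
histB = histB / sum(histB);

% Chi-squared kernel; eps guards against 0/0 for bins empty in both.
chi2 = @(x, y) sum(2 * x .* y ./ (x + y + eps));

kAB = chi2(histA, histB);
fprintf('chi2 kernel between A and B: %.4f\n', kAB);
```

With L1-normalised histograms this kernel is close to 1 when a histogram is compared with itself, and smaller the less the two histograms overlap.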