# Munzekonza Random Forest

Munzekonza is a library developed for various computer vision tasks at the Computer Vision Laboratory of ETH Zurich. Notably, it was used to perform experiments we published in:

Ristin M., Gall J., Guillaumin M., and van Gool L., From Categories to Subcategories: Large-scale Image Classification with Partial Class Label Refinement, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 15), 2015.

Ristin M., Guillaumin M., Gall J., and van Gool L., Incremental Learning of NCM Forests for Large-Scale Image Classification, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 14), 2014.

The code released here includes only the batch variant of the NCM forest. We provide a Matlab wrapper around the C++ implementation to facilitate importing and exporting the data. Moreover, we separated the individual steps (such as training and collection of leaf statistics) so that the forests can be easily parallelized on a cluster and re-used for tasks other than classification. Please note that the run-times achieved with this implementation differ from those of the original implementation used for the experiments published in the papers above, which was written purely in C++ and included optimizations.

Please consider referencing the corresponding papers when using this software.

## Installation

You need to clone the repository somewhere to your local hard disk:

git clone https://markoristin@bitbucket.org/markoristin/munzekonza_random_forest.git


We will assume that the repository is cloned to /home/mristin/projects/munzekonza_random_forest (please adjust to your system). This directory will be referred to as $REPO.

### Requirements

The Munzekonza framework depends on boost 1.55 (newer versions will probably work as well) and Eigen3. Furthermore, we developed against Matlab 8.3r2014a-fg and used gcc 4.7.2 for compiling.

### Pre-compiled Binaries

We provide pre-compiled binaries for 64-bit Linux, which should work in most cases, so that you do not have to compile them yourself. They can be found under $REPO as munzekonza_random_forest.64bit.2016-01-22.zip. Unzip the archive with:

unzip munzekonza_random_forest.64bit.2016-01-22.zip


The archive contains the folder munzekonza_random_forest_build/. We will refer to its absolute path as $BUILD in the following. For example, if you extracted the archive to /scratch/mristin/munzekonza_random_forest, $BUILD refers to /scratch/mristin/munzekonza_random_forest/munzekonza_random_forest_build.

### Compilation

If the pre-compiled binaries do not work for you out of the box, you can compile them from the sources. Adjust the paths in the file $REPO/Makefile.inc to suit your system. Once you have installed all the dependencies, change the BUILD variable in $REPO/Makefile.inc to point to the directory where the library should be built. If the directory does not exist, it will be created automatically. We will refer to this directory as $BUILD.

From the directory $REPO where you cloned the repository, issue the make all command to build everything:

cd $REPO
make all


### Export Paths

Since Matlab cannot automatically infer the location of the built binaries, you need to add the $BUILD directory to your LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$BUILD


where $BUILD is the build directory you set above. You can add the line above to your ~/.bashrc so that you do not have to issue the export command every time you start a new bash console.

Last, we need to let Matlab know where the related mex and .m files are. We can do that either by extending the environment variable MATLABPATH:

export MATLABPATH=$MATLABPATH:$BUILD:$REPO/src/matlab


or by invoking the addpath(...) command from within Matlab:

addpath('$BUILD'); addpath('$REPO/src/matlab');
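If the exports end up in ~/.bashrc, a slightly more defensive variant may be worth considering. The sketch below defines a hypothetical helper, append_path (a name made up for this example, not part of the repository), that adds a directory only when it is not already listed and avoids a leading colon when the variable was previously empty. For LD_LIBRARY_PATH, the dynamic loader interprets such an empty entry as the current working directory. The directory values are placeholders.

```shell
# append_path VAR DIR
# Hypothetical helper, not part of the repository: append DIR to the
# colon-separated variable VAR unless DIR is already listed in it.
append_path() {
    # Read the current value of the variable whose name is in $1.
    _ap_current=$(eval "printf '%s' \"\${$1:-}\"")
    case ":${_ap_current}:" in
        *":$2:"*) ;;  # already present, nothing to do
        *) export "$1=${_ap_current:+${_ap_current}:}$2" ;;
    esac
    unset _ap_current
}

# Placeholder locations; adjust to your system.
REPO=/home/mristin/projects/munzekonza_random_forest
BUILD=$REPO/lib

append_path LD_LIBRARY_PATH "$BUILD"
append_path MATLABPATH "$BUILD"
append_path MATLABPATH "$REPO/src/matlab"
```

Because the helper skips directories that are already present, opening nested shells does not make the variables grow with duplicate entries.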


Please replace $BUILD and $REPO with the actual directories. On our system, the relevant part of the ~/.bashrc looks like this:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/mristin/projects/munzekonza_random_forest/lib
export MATLABPATH=$MATLABPATH:/home/mristin/projects/munzekonza_random_forest/lib:/home/mristin/projects/munzekonza_random_forest/src/matlab


where $REPO is set to /home/mristin/projects/munzekonza_random_forest and the build directory to /home/mristin/projects/munzekonza_random_forest/lib.

## Usage

To test that everything works, we include a tiny subset of the ILSVRC 10 dataset (training and test data of 10 random categories). The whole pipeline is demonstrated in $REPO/src/matlab/+munzekonza/ncm_forest_demo.m.

Before running it, change the variable data_dir on line 10 to point to the directory with the data. The training procedure is selected with the variable training (line 40), which determines the type of forest to train: NCM Forests ('cvpr14') or regularized NCM Forests ('cvpr15'). The training parameters are set by changing the params variable (lines 44ff).

To invoke the demonstration script, type in the Matlab console:

munzekonza.ncm_forest_demo
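On a cluster it can be handy to start the demonstration without an interactive Matlab session. The following is a minimal sketch, assuming that the matlab executable is on your PATH and that the paths from the installation section are exported. The helper name run_demo is hypothetical; -nodisplay, -nosplash and -r are standard Matlab startup options. The try/catch makes Matlab exit with a non-zero status if the script fails, so that a job scheduler can detect the failure.

```shell
# Hypothetical helper, not part of the repository: run the demonstration
# script in a non-interactive Matlab session.
run_demo() {
    # The Matlab statement prints the error report and exits with status 1
    # on failure, and exits with status 0 otherwise.
    matlab -nodisplay -nosplash -r \
        "try, munzekonza.ncm_forest_demo; catch err, disp(getReport(err)); exit(1); end; exit(0);"
}

# Usage (uncomment to run):
# run_demo
```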


In the following, we describe individual parts of the implementation in more detail.

### Data

Matlab stores matrices in column-major format. Unfortunately, our implementation relies on row-major storage, so currently all data needs to be converted. The wrappers do this automatically, and you do not need to take care of it. Keep in mind, however, that the current implementation requires twice the memory, as each storage format conversion involves copying the data.

We will fix this issue in a future version. Please let us know if you need this feature.
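To get a feeling for the overhead, the peak size of a data matrix during conversion can be estimated with simple arithmetic. The sketch below assumes that the values are stored as 8-byte doubles and that exactly two copies of the matrix exist at the peak. Both are assumptions made for illustration, as are the helper name estimate_bytes and the example dimensions.

```shell
# Hypothetical helper: estimate the peak memory in bytes for a matrix with
# the given number of samples and feature dimensions.
# Assumes 8-byte doubles and two simultaneous copies (original + converted).
estimate_bytes() {
    echo $(( $1 * $2 * 8 * 2 ))
}

# Example: 100000 samples with 4096 dimensions each.
estimate_bytes 100000 4096   # prints 6553600000 (roughly 6.1 GiB)
```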

### Mex Wrappers

#### munzekonza.save_tree

To save the tree structure, invoke:

munzekonza.save_tree(tree, path)


Input arguments:

• tree pointer to the tree structure obtained by munzekonza.train_ncm_tree.
• path path to where the tree structure needs to be stored.

#### munzekonza.save_ncm_splits

To save the structure with NCM splitting functions, call:

munzekonza.save_ncm_splits(splits, path)


Input arguments:

• splits pointer to the structure containing the splitting functions obtained by munzekonza.train_ncm_tree.
• path path to where the splitting functions need to be stored.

#### munzekonza.load_tree

This command loads a tree structure which was previously trained by munzekonza.train_ncm_tree:

tree = munzekonza.load_tree(path)


Input arguments:

• path path where the tree structure was saved.

Output:

• tree pointer to the loaded tree structure.

#### munzekonza.load_ncm_splits

This command loads splitting functions which were obtained by munzekonza.train_ncm_tree:

splits = munzekonza.load_ncm_splits(path)


Input arguments:

• path path where the splitting functions were saved.

Output:

• splits pointer to the loaded structure containing the splitting functions.

#### munzekonza.destroy_tree

Since our implementation relies on a multitude of objects created in C++, Matlab cannot garbage-collect them automatically, and the memory management needs to be performed by the user.

To free the memory used by a tree structure once it is not needed anymore, call:

munzekonza.destroy_tree(tree)


Input arguments:

• tree pointer to the tree structure.

#### munzekonza.destroy_ncm_splits

Analogous to munzekonza.destroy_tree, this command frees the memory occupied by the splitting functions:

munzekonza.destroy_ncm_splits(splits)


Input arguments:

• splits pointer to the structure containing the splitting functions.

## Licence

GPLv3: http://gplv3.fsf.org/

All programs in this collection are free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

## Contact

Marko Ristin-Kaufmann (marko.ristin@gmail.com)