planZ

Reflection

First Kaggle participation with Tim and Fahrad for Machine Learning in Practice (Radboud University). Using convolutional neural networks with Caffe. Too late and average results at best, but tried and learned a lot. Code 80-90% mine, not made for reuse.

Project

First project for the Radboud University course Pattern Recognition in Practice 2015, for the team "planZ".

Usage

  • To create the pre-processed data, run data/prepare.py. You can see the options with:

    python data/prepare.py -h
    

For example (note that it has become significantly slower):

    python data/prepare.py --dimension 70 --align --features --mirror 3 --preview
  • To implement a classifier, which will choose at a node in the species tree, subclass BaseClassifier. E.g. nn_caffe/neural_net.py should be implemented for a neural network.

  • Then add it to settings.CLASSIFIERS and create or change species/chooser/YOURNODE.json to use your classifier. E.g.:

    {
        "method": "neural",
        "params": {
            "froink": "tjielp",
            "blop": "pudding"
        }
    }
    
  • Run by executing run/go.py from the main project directory:

    python2 run/go.py
    
  • When making your own code, make sure that you first use:

    # load the entire species tree; this gives you the root element
    from species.read_tree import get_root
    root = get_root()

    # optionally, separate the data into train, test and validation sets
    from data.split import get_train_test_val
    train, test, validate = get_train_test_val(root.classifier.options)
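
To illustrate how a node configuration could map onto a classifier class, here is a minimal, self-contained sketch. It is not the project's actual code: the `BaseClassifier` below is a stand-in whose constructor is an assumption, `NeuralClassifier` and `classifier_from_config` are hypothetical names, and the `CLASSIFIERS` dictionary only imitates the idea of settings.CLASSIFIERS.

```python
import json


class BaseClassifier(object):
    """Stand-in for the project's BaseClassifier; the real interface may differ."""

    def __init__(self, **params):
        # Assumption: the "params" object from the JSON file is passed as keywords.
        self.params = params


class NeuralClassifier(BaseClassifier):
    """Hypothetical subclass, playing the role of nn_caffe/neural_net.py."""


# Hypothetical registry in the spirit of settings.CLASSIFIERS:
# it maps the "method" name used in the JSON file to a class.
CLASSIFIERS = {"neural": NeuralClassifier}


def classifier_from_config(text, registry=CLASSIFIERS):
    """Build a classifier from the JSON text of a node configuration."""
    config = json.loads(text)
    method = config["method"]
    if method not in registry:
        raise ValueError("unknown method %r, known: %s" % (method, sorted(registry)))
    return registry[method](**config.get("params", {}))


example = '{"method": "neural", "params": {"froink": "tjielp", "blop": "pudding"}}'
clf = classifier_from_config(example)
```

Failing early on an unknown "method" makes a typo in a species/chooser/YOURNODE.json file show up when the configuration is read, rather than somewhere during training.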

Caffe (and other requirements)

The installation instructions are at http://caffe.berkeleyvision.org/installation.html . The source code is already in the repository at /caffe/ .

The Ubuntu 14.04 commands to install the prerequisites:

sudo apt-get install -y python-dev libblas3gf libblas-doc libblas-dev liblapack3gf liblapack-doc liblapack-dev libatlas-base-dev libopencv-dev libboost-all-dev libprotobuf-dev protobuf-compiler libgoogle-glog-dev libgflags-dev libhdf5-serial-dev libleveldb-dev libsnappy-dev liblmdb-dev graphviz python-numpy python-opencv
sudo pip install --upgrade 'leveldb>=0.191' 'nose>=1.3.0' 'python-dateutil>=1.4,<2' 'protobuf>=2.5.0' 'python-gflags>=2.0' 'pyyaml>=3.10' 'pillow>=2.7.0' 'pyparsing==1.5.7' 'pydot' 'matplotlib' 'simplejson'

(Some of these are for our own scripts rather than for caffe.)

Then update PYTHONPATH (replace the_path_to_this_repository_here). This needs to happen every time you open a terminal, so add the line to e.g. your ~/.bashrc file:

export PYTHONPATH=$PYTHONPATH:the_path_to_this_repository_here/caffe/python
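
If you would rather not depend on the shell environment, a script can also extend the import path itself before importing caffe. The sketch below is only an alternative under assumptions: `add_caffe_to_path` is a hypothetical helper, not part of this repository, and `"."` is a placeholder for the path to this repository.

```python
import os
import sys


def add_caffe_to_path(repo_root):
    """Append <repo_root>/caffe/python to sys.path if it is not there yet."""
    caffe_python = os.path.join(os.path.abspath(repo_root), "caffe", "python")
    if caffe_python not in sys.path:
        sys.path.append(caffe_python)
    return caffe_python


# "." is a placeholder; pass the path to this repository instead.
path = add_caffe_to_path(".")
```

This only affects the running Python process, so it has to be called before the first `import caffe` in every script that needs it.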

To actually build the code, go to the caffe directory. If it is empty, type:

git submodule update --init --recursive
cp ../dev/Makefile.config .

Then cross your fingers and type:

make all -j4
make pycaffe

To see if it worked, you can type:

make test -j4
make runtest  # <-- this should show a lot of ok's
python -c "import caffe"  # <-- this should show no output

Running

The command to run is:

caffe/build/tools/caffe train -solver nn_caffe/nets/solver6.prototxt

To run it in the background on the server, so that it keeps running if you close the connection, wrap it in nohup (... is the above command):

nohup ... &  # to start
tail -f nohup.out  # to see the output (nohup writes to nohup.out by default)

You can press ctrl+C to stop following the output; the command will keep running. You can check this with top.
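
The same idea can be sketched from Python, in case the training is started by a script instead of by hand. This is an assumption-laden alternative, not how the project does it: it needs Python 3 (the project scripts themselves are run with python2), `start_in_background` is a hypothetical helper, `train.log` is an arbitrary file name, and the command used here is a harmless stand-in for the caffe command.

```python
import subprocess
import sys


def start_in_background(command, log_path):
    """Start command detached from the terminal, appending its output to log_path."""
    log = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdout=log,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
        # POSIX only: run in a new session, so that closing the
        # terminal does not hang up the process.
        start_new_session=True,
    )
    log.close()  # the child process keeps its own copy of the file handle
    return process


# Harmless stand-in command; replace it with the caffe train command.
process = start_in_background([sys.executable, "-c", "print('started')"], "train.log")
```

As with nohup, the output can then be followed with tail -f on the log file.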

Git

See which files have been changed, but not committed (red = not added, green = added but not committed):

git status

Get the changes other people have made:

git pull

If this says something about merge conflicts, check the files it mentions (you can see them again with git status). Git has added conflict markers (lines starting with <<<<<<<, ======= and >>>>>>>) to these files; fix the files, then commit the changes.

To send changes you have made, you have to do three things: add all the changed files, create a commit (which is like a chapter) and send all the commits:

git add --all
git commit -m "type a short description here"
git push -u origin master