Author: Grzegorz Chrupała <firstname.lastname@example.org>
Progressive is a multilabel classification model which learns sequentially (online). The set of labels need not be known in advance: the learner keeps a constantly updated set of top N most frequent labels seen so far and predicts labels from this set.
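The label-set bookkeeping described above can be illustrated with a small sketch. This is not progressive's actual implementation (which is written in Haskell); the class name `LabelTracker` and its methods are hypothetical, and ties between equally frequent labels are broken by whatever order `Counter.most_common` yields.

```python
from collections import Counter


class LabelTracker:
    """Hypothetical helper: running label counts plus the N most frequent labels."""

    def __init__(self, max_labels):
        self.max_labels = max_labels
        self.counts = Counter()

    def observe(self, labels):
        # Add the labels of one training example to the running counts.
        self.counts.update(labels)

    def candidates(self):
        # The labels a learner of this kind would choose predictions from:
        # the N most frequent labels seen so far.
        return [label for label, _ in self.counts.most_common(self.max_labels)]


tracker = LabelTracker(max_labels=2)
for labels in (["fun", "misc"], ["fun"], ["misc", "news"], ["fun"]):
    tracker.observe(labels)
print(tracker.candidates())  # ['fun', 'misc']
```

Because the counts are updated after every example, a label that becomes frequent later in the stream can displace an earlier one from the candidate set.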
The package provides the executable progressive. You need the Vowpal Wabbit machine learning toolkit to use progressive. You can compile and install it from source. On Ubuntu or Debian, you can install the vowpal-wabbit package. Either way, make sure you have the Vowpal Wabbit executable (called vw) installed somewhere in your path.
To compile progressive, first install the Haskell Platform. Once you have it, run the following:
cabal update
cabal install --bindir=DIRECTORY
Replace DIRECTORY with the directory where you want to install the executable. Make sure progressive is in your PATH.
progressive can run in a learning mode which interleaves learning and prediction:
progressive --size SIZE-IN-BITS --max-labels NUMBER-OF-LABELS MODEL-PATH
This runs in learning mode, with the model size set to SIZE-IN-BITS bits, and saves the model to MODEL-PATH. progressive can also run in pure prediction mode, where it simply uses a previously learned model to predict labels for new data:
progressive --no-learn MODEL-PATH
This uses the model in MODEL-PATH to predict labels for new data, and does not perform any learning.
For optimum results, use the maximum size of the model allowed: 29 bits. You may need to set it to a lower value if you don't have enough RAM.
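To get a rough feel for the RAM involved, the sketch below estimates the size of the weight table for a given number of bits. It rests on two assumptions that this README does not confirm: that SIZE-IN-BITS is the base-2 logarithm of the number of weight slots (as with Vowpal Wabbit's `-b` option), and that each slot holds one 4-byte float. The function name `weight_table_bytes` is hypothetical, and actual memory use may be higher.

```python
def weight_table_bytes(size_in_bits, bytes_per_weight=4):
    # Assumption: the table has 2**size_in_bits slots of bytes_per_weight bytes each.
    return (1 << size_in_bits) * bytes_per_weight


for bits in (24, 27, 29):
    gib = weight_table_bytes(bits) / 2**30
    print(f"{bits} bits: about {gib} GiB")
```

Under these assumptions, 29 bits corresponds to about 2 GiB, and each bit removed halves the estimate.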
Each training example fits on one line and consists of a number of space-separated fields. The first field contains a comma-separated list of labels. The rest of the fields contain features. A feature is either a string (excluding spaces and colons), or a string followed by a colon, followed by a number. If the feature is just a string, its value is implicitly 1.0. Example:
fun,wierd,misc something this:3 or that:2.0
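The format can be made concrete with a short parsing sketch. The function `parse_example` is hypothetical and not part of progressive; it only follows the rules stated above, and it assumes that a feature name repeated on one line simply overwrites the earlier value.

```python
def parse_example(line):
    """Parse one example line into (labels, features)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty example line")
    # First field: comma-separated labels.
    labels = fields[0].split(",")
    features = {}
    for field in fields[1:]:
        # "name:value" carries an explicit value; a bare "name" means 1.0.
        name, sep, value = field.partition(":")
        features[name] = float(value) if sep else 1.0
    return labels, features


labels, features = parse_example("fun,wierd,misc something this:3 or that:2.0")
print(labels)    # ['fun', 'wierd', 'misc']
print(features)  # {'something': 1.0, 'this': 3.0, 'or': 1.0, 'that': 2.0}
```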