Whoosh demo application


This is a version of the quick demo application I made for my PyCon 2013 presentation.

The demo source has two halves: code to index the Python documentation, and a simple Flask web app to search and inspect a Whoosh index.

Requirements

  • Whoosh search library.
  • Flask WSGI microframework.
  • Mako templating engine.
  • tzellman's Flask Mako plugin to integrate Mako with Flask. NOTE: the Flask-Mako package on PyPI will not work.

Get the demo source code

Use Mercurial to clone the demo source code:

% hg clone
% cd whoosh-demo

Index the Python documentation

Use Mercurial to fetch the Python source code (this can take a while):

% hg clone

The demo code includes the ability to search by revision number and/or date (to show the use of NUMERIC and DATETIME fields). Use the script to generate a file mapping filenames to revision numbers:

% python scripts/ cpython/Doc >revs.txt
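
As a rough illustration of how an indexing script might consume such a mapping file, here is a minimal Python sketch. The file format is an assumption (one tab-separated "filename, revision" pair per line); the real layout of revs.txt may differ. The name parse_revs and the sample values are hypothetical.

```python
def parse_revs(lines):
    """Return a dict mapping filename -> integer revision number.

    ASSUMPTION: each non-blank line looks like "filename<TAB>revision".
    The actual format written by the demo's script is not documented here.
    """
    revs = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so filenames containing tabs still parse.
        filename, _, rev = line.rpartition("\t")
        revs[filename] = int(rev)
    return revs


# Made-up sample data, for illustration only.
sample = ["library/hashlib.rst\t101\n", "\n", "library/zipfile.rst\t102\n"]
revs = parse_revs(sample)
```

With a real file, the same function could be called as parse_revs(open("revs.txt")).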

Run the script to index the Python documentation:

% mkdir index
% python scripts/ cpython/Doc index revs.txt

The index has the following fields.

  • path - the path to the document.
  • title - the document title.
  • tgrams - the document title as N-grams.
  • content - the document content (including the title).
  • chapter - the directory name containing the document.
  • size - the size of the original file in bytes.
  • rev - the Mercurial revision number in which the file was last committed.
  • revised - the date of the revision in which the file was last committed.
  • modref - a reference to a module. For example, searching for modref:hashlib will find files that reference the hashlib module.
  • clsref - a reference to a class.
  • funcref - a reference to a function.
  • pep - a reference to a PEP.
  • cls - the documentation for a class. For example, searching for cls:zipfile will find the documentation for the ZipFile class.
  • mod - the documentation for a module.
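
The field list above could be expressed as a Whoosh schema roughly as follows. This is a sketch, not the demo's actual schema: the field types, the stored/unique options, and the lowercase settings are all assumptions chosen to match the descriptions (for example, lowercase=True so that cls:zipfile can match ZipFile). The function name make_schema is hypothetical.

```python
def make_schema():
    # Imported inside the function so this sketch can be loaded even where
    # Whoosh is not installed.
    from whoosh.fields import (Schema, ID, TEXT, KEYWORD, NUMERIC,
                               DATETIME, NGRAMWORDS)

    # ASSUMPTION: every type and option below is a guess based on the field
    # descriptions, not a copy of the demo's code.
    return Schema(
        path=ID(stored=True, unique=True),   # path to the document
        title=TEXT(stored=True),             # document title
        tgrams=NGRAMWORDS(),                 # title as N-grams
        content=TEXT(),                      # document body
        chapter=ID(stored=True),             # containing directory name
        size=NUMERIC(stored=True),           # file size in bytes
        rev=NUMERIC(stored=True),            # Mercurial revision number
        revised=DATETIME(stored=True),       # date of that revision
        modref=KEYWORD(lowercase=True),      # referenced modules
        clsref=KEYWORD(lowercase=True),      # referenced classes
        funcref=KEYWORD(lowercase=True),     # referenced functions
        pep=KEYWORD(),                       # referenced PEPs
        cls=KEYWORD(lowercase=True),         # classes documented here
        mod=KEYWORD(lowercase=True),         # modules documented here
    )
```

NUMERIC and DATETIME are the field types that make range searches on rev and revised possible; the other choices are interchangeable with similar Whoosh types.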

Start the web server

Set environment variables for the documentation source directory and for the index directory you want to search:

% export DEMOSOURCE=cpython/Doc
% export DEMOINDEX=index

On Windows:

> set demosource=cpython/Doc
> set whooshindex=index
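
Before starting the server it can help to confirm the variables are actually set in the current shell. The following Python sketch does that check. The helper name missing_vars is hypothetical, and the variable names in the usage line are taken from the Unix commands above; the names the demo app really reads may differ.

```python
import os


def missing_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]


# ASSUMPTION: these are the names the demo expects; adjust as needed.
unset = missing_vars(("DEMOSOURCE", "DEMOINDEX"))
if unset:
    print("Not set: " + ", ".join(unset))
```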

Run the script to start the web server:

% python

In a browser go to the following address for the search interface:

There are two additional server apps available:

  • - choose a field and enter text to see the tokens produced by that field's analyzer.
  • - choose a field to see a list of all the indexed terms in that field. WARNING: can produce enormous, slow-loading pages for fields with thousands of terms (such as tgrams).
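
To see why an N-gram field such as tgrams accumulates so many terms, here is a small pure-Python sketch of word N-grams. It does not use Whoosh and is not the demo's analyzer. The function name word_ngrams is hypothetical, and the gram sizes (2 to 4) are an assumption for illustration.

```python
def word_ngrams(text, minsize=2, maxsize=4):
    """Return the set of character N-grams for each word in `text`.

    ASSUMPTION: words are split on whitespace and lowercased, and grams
    range from `minsize` to `maxsize` characters.
    """
    grams = set()
    for word in text.lower().split():
        for size in range(minsize, maxsize + 1):
            # Slide a window of `size` characters across the word.
            for start in range(len(word) - size + 1):
                grams.add(word[start:start + size])
    return grams


print(sorted(word_ngrams("zip")))   # ['ip', 'zi', 'zip']
print(len(word_ngrams("hashlib")))  # 15
```

A single seven-letter word already yields 15 distinct terms, so a field built from every document title grows quickly.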