Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. Programmers can use it to easily add search functionality to their applications and websites. Every part of how Whoosh works can be extended or replaced to meet your needs exactly.
Some of Whoosh's features include:
- Pythonic API.
- Pure-Python. No compilation or binary packages needed, no mysterious crashes.
- Fielded indexing and search.
- Fast indexing and retrieval -- faster than any other pure-Python search solution I know of. See Benchmarks.
- Pluggable scoring algorithm (including BM25F), text analysis, storage, posting format, etc.
- Powerful query language.
- Production-quality pure Python spell-checker (as far as I know, the only one).
Whoosh might be useful in the following circumstances:
- Anywhere a pure-Python solution is desirable to avoid having to build/compile native libraries (or force users to build/compile them).
- As a research platform (at least for programmers who find Python easier to read and work with than Java ;) ).
- When an easy-to-use Pythonic interface is more important to you than raw speed.
- If your application can make good use of a single, deeply integrated search/lookup solution that is always available, rather than maintaining two different search solutions (a simple, slow, homegrown one built in, plus an indexed, fast one offered as an optional external binary dependency).
Whoosh was created and is maintained by Matt Chaput. It was originally created for use in the online help system of Side Effects Software's 3D animation software Houdini. Side Effects Software Inc. graciously agreed to open-source the code.
- Read the online documentation.
- Join the Whoosh mailing list.
- Browse Whoosh topics on Stack Overflow.
- Join the ##whoosh IRC channel on chat.freenode.net (this is not an official support channel). You can use the web chat interface. "Ask/tell AND WAIT" is the motto there.
- Look at Whoosh's Openhub page.
- Read about Use Cases - usages, users, testimonials, etc.
Starting from version 1.8, the software is licensed under the terms of the simplified ("two-clause") BSD License. See LICENSE.txt for details.
Previously, the software was licensed under the terms of the Apache License version 2.
Whoosh is compatible with Python 2.5 and higher. As of Whoosh 2.0 it is compatible with Python 3. If you find a problem with the Python 3 compatibility, please file a bug.
If you have ``setuptools`` or ``pip`` installed, you can use ``easy_install`` or ``pip`` to download and install Whoosh automatically:
$ easy_install Whoosh
$ pip install Whoosh
Getting the source
Download source releases from PyPI at http://pypi.python.org/pypi/Whoosh/
You can check out the latest version of the source code using Mercurial:
hg clone http://bitbucket.org/mchaput/whoosh