Overview

CSearch is a simple source code search engine. It is intended to help programmers in a small to medium-sized organization maintain an overview of the organization's software projects and reuse parts of them.

It includes

  • a command line tool that scans through a predefined set of software projects while building an index of the code
  • a small CherryPy-based web interface for searching the indexed code for words or regular expressions.
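
To make the index-then-search idea concrete, here is a minimal sketch of a word index stored in SQLite, with word lookup and a regular-expression scan. It is not CSearch's actual schema or code: the table names, the function names (build_index, search_word, search_regex) and the in-memory database are all assumptions made for illustration, and the sketch is written for a current Python rather than Python 2.5.

```python
import re
import sqlite3


def build_index(files):
    """Index a {path: source_text} mapping into an in-memory SQLite database.

    Hypothetical schema: 'lines' keeps the raw text, 'words' maps each
    lower-cased identifier to the lines it occurs on.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE lines (path TEXT, lineno INTEGER, text TEXT)")
    db.execute("CREATE TABLE words (word TEXT, path TEXT, lineno INTEGER)")
    for path, source in files.items():
        for lineno, text in enumerate(source.splitlines(), 1):
            db.execute("INSERT INTO lines VALUES (?, ?, ?)", (path, lineno, text))
            # One row per distinct identifier-like token on the line.
            for word in set(re.findall(r"[A-Za-z_]\w*", text)):
                db.execute(
                    "INSERT INTO words VALUES (?, ?, ?)",
                    (word.lower(), path, lineno),
                )
    db.execute("CREATE INDEX word_idx ON words (word)")
    return db


def search_word(db, word):
    """Return (path, lineno) pairs for lines containing the word."""
    rows = db.execute(
        "SELECT path, lineno FROM words WHERE word = ? ORDER BY path, lineno",
        (word.lower(),),
    )
    return rows.fetchall()


def search_regex(db, pattern):
    """Return (path, lineno) pairs for lines matching a regular expression."""
    regex = re.compile(pattern)
    rows = db.execute("SELECT path, lineno, text FROM lines ORDER BY path, lineno")
    return [(path, lineno) for path, lineno, text in rows if regex.search(text)]
```

Word lookup uses the index directly, while the regular-expression search in this sketch simply scans the stored lines; a real indexer may well do something smarter.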

Acknowledgements

This tool uses the free Plex tool written by Greg Ewing. See further details at http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex

Parts of this code are based on example code from "Programming Collective Intelligence" by Toby Segaran. Copyright 2007 Toby Segaran, 978-0-596-52932-1.

Thanks to Jeppe Brønsted <bronsted@bitbucket.org> for writing the incremental search web frontend (e.g. the files in the static dir).

Installation

Before using CSearch, install the third-party packages listed in the Requirements section below. Then run

python setup.py install

By default CSearch stores the index, configuration files, etc. in [sys.prefix]/csearch. On *nix systems this typically amounts to /usr/csearch. On Windows it would be something like C:\Python25\csearch.

It will also install some scripts, namely

  • csearch-index.py - the CSearch indexing tool
  • csearch-search.py - the CSearch command line search tool
  • csearch-server.py - the CSearch web server

On *nix systems they typically go into /usr/bin, and on Windows you'll find them in e.g. C:\Python25\Scripts.

Before indexing, edit the project.xml file located in the folder described above. Then run

csearch-index.py

This will build an index in the file

[sys.prefix]/csearch/index.db
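
If you are unsure where [sys.prefix] points on your system, the locations can be computed from Python itself. The following is a small sketch, not part of CSearch; the helper names default_data_dir and default_index_path are made up for illustration, and it assumes the default location has not been changed.

```python
import os
import sys


def default_data_dir(prefix=None):
    """Return the default CSearch data folder, [sys.prefix]/csearch."""
    if prefix is None:
        prefix = sys.prefix  # installation prefix of the running interpreter
    return os.path.join(prefix, "csearch")


def default_index_path(prefix=None):
    """Return the default index file, [sys.prefix]/csearch/index.db."""
    return os.path.join(default_data_dir(prefix), "index.db")


if __name__ == "__main__":
    print(default_data_dir())
    print(default_index_path())
```

Run it with the same Python interpreter that was used for the installation, since sys.prefix differs between interpreters and virtual environments.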

You can then edit the server configuration file csearch.conf and start the web server with

csearch-server.py

If you set, for example,

host = 127.0.0.1
port = 8080

in the csearch.conf file and start the web server with the command above, a basic web frontend will be available at

http://localhost:8080

and a nice incremental frontend will be available at

http://localhost:8080/inc/
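
The relation between the two settings and the two addresses can be sketched as follows. This is only an illustration: the full layout of csearch.conf is not described here, so the sketch assumes plain "key = value" lines and skips anything that looks like a comment or a section header. The function names read_settings and frontend_urls are hypothetical.

```python
def read_settings(text):
    """Collect 'key = value' pairs from configuration text (assumed layout)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and section headers such as [global].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def frontend_urls(settings):
    """Return the (basic, incremental) frontend URLs for the given settings."""
    base = "http://%s:%s" % (settings["host"], settings["port"])
    return base, base + "/inc/"


if __name__ == "__main__":
    conf = read_settings("host = 127.0.0.1\nport = 8080\n")
    for url in frontend_urls(conf):
        print(url)
```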

Requirements

The CSearch indexer and web server require these third-party packages to run:

  • Python 2.5+
  • PySQLite
  • BeautifulSoup (http://www.crummy.com/software/BeautifulSoup)
  • SVN client (if you want to index svn repositories)
  • ssh + rsync (if you want to index remote file systems)
  • CherryPy (if you want to use the web server)

Other source code indexers

Gonzui: http://gonzui.sourceforge.net

Future plans

Here's a list of interesting extensions that haven't been implemented yet - and maybe never will be?

  • Support for CVS and Mercurial.
  • Train a neural network based on user rankings of search results and use the network to rank future search results.
  • Search in project metadata, e.g. the fields in project.xml. For example, search only a certain project, or only projects with the maturity attribute set to stable. Use the metadata to rank the search results.
  • Search only code in a certain language, e.g. C++ or Java code.
  • Use boolean operators (AND, OR and NOT) to narrow a search.
  • Index previous changesets and comments in revision control systems such as SVN, CVS, and Mercurial.
  • Index all kinds of non-binary files.
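
As a rough idea of what the boolean-operator item could look like, here is a sketch that combines per-word result sets with set operations. None of this exists in CSearch; the function name matching_files and the shape of the postings mapping (word to set of file paths) are assumptions made for the example.

```python
def matching_files(postings, all_of=(), any_of=(), none_of=()):
    """Combine per-word file sets with AND / OR / NOT semantics.

    postings: dict mapping a word to the set of files containing it.
    all_of:   every word must occur (AND).
    any_of:   at least one word must occur (OR); ignored when empty.
    none_of:  no word may occur (NOT).
    """
    # Start from every indexed file and narrow down from there.
    result = set().union(*postings.values())
    for word in all_of:
        result &= postings.get(word, set())
    if any_of:
        result &= set().union(*(postings.get(w, set()) for w in any_of))
    for word in none_of:
        result -= postings.get(word, set())
    return result
```

For instance, with "open" found in a.c and b.c and "close" found in b.c and c.c, asking for all_of=["open", "close"] leaves only b.c.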

Contact

Comments, suggestions, etc. are welcome. Send an email to tpj@cs.au.dk.