BKTree

This is an implementation of a Burkhard-Keller tree (BK-tree) in C. It also provides a Python API.

Requirements for using the library:

  • tokyo-cabinet
  • glib2
  • Python headers (for the Python API)
  • GCC 4.7 or higher

The library stores its data in the /usr/local/var/bktree directory. Create this directory and give write permission on it to the user who will run the library.

Building

Create a directory named build (or any name of your choice) in the root source folder of the repository, then run cmake and make from inside it:

mkdir build
cd build
cmake ..
make

Testing

You can test both the library and the Python API. To test the library, run the tests binary from the build directory:

cd build
./tests

To test the Python API, run:

cd build
python tests.py

Python API

The Python API currently exposes these functions:

create_tree

and classes:

BKTree

create_tree

Creates the tree in the filesystem.

create_tree(dbcode, filename)

dbcode - the unique identifier for the tree

filename - the name of the file containing the words to be added to the tree. Put one word on each line, optionally followed by its frequency, separated by a space. An example file:

    % head ~/Workspace/mquotient/data/tokyo/54
    YAG 235
    RAYACHURI 40
    BHAGESH 396
    JAMANI 39
    SANGMESHWAR 525
    CWA 77
    PROSANT 44
    SYEDNASEER 40
    PONDI 45
    SRIDHER 690

Returns the number of words added to the tree.
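The input format described above can be produced with a few lines of Python. This is a minimal sketch, not part of the library: write_word_file and read_word_file are hypothetical helper names, and the handling of a missing frequency (the word alone on its line) follows the "optional frequency" wording above.

```python
import os
import tempfile


def write_word_file(path, entries):
    """Write one word per line, followed by a space and its frequency if given."""
    with open(path, "w") as handle:
        for word, freq in entries:
            if freq is None:
                handle.write(word + "\n")
            else:
                handle.write("%s %d\n" % (word, freq))


def read_word_file(path):
    """Parse the file back into (word, frequency-or-None) pairs."""
    entries = []
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            freq = int(parts[1]) if len(parts) > 1 else None
            entries.append((parts[0], freq))
    return entries


# Sample entries taken from the example above; PONDI is written without a
# frequency here only to show the optional form.
SAMPLE = [("YAG", 235), ("RAYACHURI", 40), ("PONDI", None)]

if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "words.txt")
    write_word_file(target, SAMPLE)
    print(read_word_file(target))
```

The resulting file path is what would be passed as the filename argument of create_tree.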

BKTree

BKTree(dbcode, access='r')
  • dbcode - the unique identifier for the tree
  • access - r opens the tree for read access, w for write access

Methods

best - Returns a tuple of the best match and its score for a given word.

best(word, max_dist=len(word)/2)

query - Returns a list of (word, score) tuples for the words that are at most max_dist from the given word.

query(word, max_dist=len(word)/2, ini_score=0)
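To illustrate what max_dist means in best and query, here is a small in-memory BK-tree in pure Python. It is a sketch under assumptions, not the library's implementation: SketchBKTree is a hypothetical name, the distance metric is assumed to be Levenshtein edit distance, the score is taken to be the distance itself, and frequencies and ini_score are ignored. The default of len(word) // 2 mirrors the default shown in the signatures above.

```python
def levenshtein(a, b):
    """Edit distance between two strings (assumed metric for this sketch)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


class SketchBKTree(object):
    """In-memory BK-tree; each node is a (word, {distance: child}) pair."""

    def __init__(self):
        self.root = None

    def add(self, word):
        """Insert word; return False if it is already in the tree."""
        if self.root is None:
            self.root = (word, {})
            return True
        node = self.root
        while True:
            dist = levenshtein(word, node[0])
            if dist == 0:
                return False
            child = node[1].get(dist)
            if child is None:
                node[1][dist] = (word, {})
                return True
            node = child

    def query(self, word, max_dist=None):
        """Return (word, distance) pairs within max_dist, closest first."""
        if max_dist is None:
            max_dist = len(word) // 2
        matches = []
        stack = [self.root] if self.root else []
        while stack:
            node_word, children = stack.pop()
            dist = levenshtein(word, node_word)
            if dist <= max_dist:
                matches.append((node_word, dist))
            # Triangle inequality: only subtrees whose edge distance lies in
            # [dist - max_dist, dist + max_dist] can contain matches.
            for edge, child in children.items():
                if dist - max_dist <= edge <= dist + max_dist:
                    stack.append(child)
        return sorted(matches, key=lambda m: (m[1], m[0]))

    def best(self, word, max_dist=None):
        """Return the closest (word, distance) pair, or None if no match."""
        matches = self.query(word, max_dist)
        return matches[0] if matches else None


if __name__ == "__main__":
    tree = SketchBKTree()
    for w in ["book", "books", "cake", "boo", "cape", "cart", "boon", "cook"]:
        tree.add(w)
    print(tree.query("bool", 1))
    print(tree.best("bool"))
```

A smaller max_dist prunes more of the tree and returns fewer candidates; a larger one visits more nodes and returns looser matches.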

add - Adds the given word to the tree. Returns True on success.

add(word, freq=1)

delete - Deletes the given word from the tree. Returns True on success.

delete(word)