Overview

Binary Tree Package

Abstract

This package provides Binary- RedBlack- and AVL-Trees written in Python and Cython.

This Classes are much slower than the built-in dict class, but all iterators/generators yielding data in sorted key order.

Source of Algorithms

AVL- and RBTree algorithms taken from Julienne Walker: http://eternallyconfuzzled.com/jsw_home.aspx

Trees written in Python (only standard library)

  • BinaryTree -- unbalanced binary tree
  • AVLTree -- balanced AVL-Tree
  • RBTree -- balanced Red-Black-Tree

Trees written with C-Functions and Cython as wrapper

  • FastBinaryTree -- unbalanced binary tree
  • FastAVLTree -- balanced AVL-Tree
  • FastRBTree -- balanced Red-Black-Tree

All trees provides the same API, the pickle protocol is supported.

FastXTrees has C-structs as tree-node structure and C-implementation for low level operations: insert, remove, get_value, max_item, min_item.

Constructor

  • Tree() -> new empty tree;
  • Tree(mapping) -> new tree initialized from a mapping (requires only an items() method)
  • Tree(seq) -> new tree initialized from seq [(k1, v1), (k2, v2), ... (kn, vn)]

Methods

  • __contains__(k) -> True if T has a key k, else False, O(log(n))
  • __delitem__(y) <==> del T[y], del[s:e], O(log(n))
  • __getitem__(y) <==> T[y], T[s:e], O(log(n))
  • __iter__() <==> iter(T)
  • __len__() <==> len(T), O(1)
  • __max__() <==> max(T), get max item (k,v) of T, O(log(n))
  • __min__() <==> min(T), get min item (k,v) of T, O(log(n))
  • __and__(other) <==> T & other, intersection
  • __or__(other) <==> T | other, union
  • __sub__(other) <==> T - other, difference
  • __xor__(other) <==> T ^ other, symmetric_difference
  • __repr__() <==> repr(T)
  • __setitem__(k, v) <==> T[k] = v, O(log(n))
  • clear() -> None, remove all items from T, O(n)
  • copy() -> a shallow copy of T, O(n*log(n))
  • discard(k) -> None, remove k from T, if k is present, O(log(n))
  • get(k[,d]) -> T[k] if k in T, else d, O(log(n))
  • is_empty() -> True if len(T) == 0, O(1)
  • items([reverse]) -> generator for (k, v) items of T, O(n)
  • keys([reverse]) -> generator for keys of T, O(n)
  • values([reverse]) -> generator for values of T, O(n)
  • pop(k[,d]) -> v, remove specified key and return the corresponding value, O(log(n))
  • popitem() -> (k, v), remove and return some (key, value) pair as a 2-tuple, O(log(n))
  • setdefault(k[,d]) -> T.get(k, d), also set T[k]=d if k not in T, O(log(n))
  • update(E) -> None. Update T from dict/iterable E, O(E*log(n))
  • foreach(f, [order]) -> visit all nodes of tree (0 = 'inorder', -1 = 'preorder' or +1 = 'postorder') and call f(k, v) for each node, O(n)

slicing by keys

  • itemslice(s, e) -> generator for (k, v) items of T for s <= key < e, O(n)
  • keyslice(s, e) -> generator for keys of T for s <= key < e, O(n)
  • valueslice(s, e) -> generator for values of T for s <= key < e, O(n)
  • T[s:e] -> TreeSlice object, with keys in range s <= key < e, O(n)
  • del T[s:e] -> remove items by key slicing, for s <= key < e, O(n)

start/end parameter:

  • if 's' is None or T[:e] TreeSlice/iterator starts with value of min_key();
  • if 'e' is None or T[s:] TreeSlice/iterator ends with value of max_key();
  • T[:] is a TreeSlice which represents the whole tree;

TreeSlice is a tree wrapper with range check, and contains no references to objects, deleting objects in the associated tree also deletes the object in the TreeSlice.

  • TreeSlice[k] -> get value for key k, raises KeyError if k not exists in range s:e

  • TreeSlice[s1:e1] -> TreeSlice object, with keys in range s1 <= key < e1
    • new lower bound is max(s, s1)
    • new upper bound is min(e, e1)

TreeSlice methods:

  • items() -> generator for (k, v) items of T, O(n)
  • keys() -> generator for keys of T, O(n)
  • values() -> generator for values of T, O(n)
  • __iter__ <==> keys()
  • __repr__ <==> repr(T)
  • __contains__(key)-> True if TreeSlice has a key k, else False, O(log(n))

prev/succ operations

  • prev_item(key) -> get (k, v) pair, where k is predecessor to key, O(log(n))
  • prev_key(key) -> k, get the predecessor of key, O(log(n))
  • succ_item(key) -> get (k,v) pair as a 2-tuple, where k is successor to key, O(log(n))
  • succ_key(key) -> k, get the successor of key, O(log(n))

Heap methods

  • max_item() -> get largest (key, value) pair of T, O(log(n))
  • max_key() -> get largest key of T, O(log(n))
  • min_item() -> get smallest (key, value) pair of T, O(log(n))
  • min_key() -> get smallest key of T, O(log(n))
  • pop_min() -> (k, v), remove item with minimum key, O(log(n))
  • pop_max() -> (k, v), remove item with maximum key, O(log(n))
  • nlargest(i[,pop]) -> get list of i largest items (k, v), O(i*log(n))
  • nsmallest(i[,pop]) -> get list of i smallest items (k, v), O(i*log(n))

Set methods (using frozenset)

  • intersection(t1, t2, ...) -> Tree with keys common to all trees
  • union(t1, t2, ...) -> Tree with keys from either trees
  • difference(t1, t2, ...) -> Tree with keys in T but not any of t1, t2, ...
  • symmetric_difference(t1) -> Tree with keys in either T and t1 but not both
  • issubset(S) -> True if every element in T is in S
  • issuperset(S) -> True if every element in S is in T
  • isdisjoint(S) -> True if T has a null intersection with S

Classmethods

  • fromkeys(S[,v]) -> New tree with keys from S and values equal to v.

Performance

Profiling with timeit(): 5000 unique random int keys, time in seconds

unbalanced Trees CPython 2.7.2 FastBinaryTree ipy 2.7.0 pypy 1.7.0
build time 100x 7,55 0,60 2,51 0,29
build & delete time 100x 13,34 1,48 4,45 0,47
search 100x all keys 2,86 0,96 0,27 0,06
AVLTrees CPython 2.7.2 FastAVLTree ipy 2.7.0 pypy 1.7.0
build time 100x 22,66 0,65 10,45 1,29
build & delete time 100x 36,71 1,47 20,89 3,02
search 100x all keys 2,34 0,85 0,89 0,14
RBTrees CPython 2.7.2 FastRBTree ipy 2.7.0 pypy 1.7.0
build time 100x 14,78 0,65 4,43 0,49
build & delete time 100x 39,34 1,63 12,43 1,32
search 100x all keys 2,32 0,86 0,86 0,13

News

Version 1.0.0

  • bug fixes
  • status: 5 - Production/Stable
  • removed useless TreeIterator() class and T.treeiter() method.
  • patch from Max Motovilov to use Visual Studio 2008 for building C-extensions

Version 0.4.0

  • API change!!!
  • full Python 3 support, also for Cython implementations
  • removed user defined compare() function - keys have to be comparable!
  • removed T.has_key(), use 'key in T'
  • keys(), items(), values() generating 'views'
  • removed iterkeys(), itervalues(), iteritems() methods
  • replaced index slicing by key slicing
  • removed index() and item_at()
  • repr() produces a correct representation
  • installs on systems without cython (tested with pypy)
  • new license: GNU Library or Lesser General Public License (LGPL)

Installation

from source:

python setup.py install

Documentation

this README.txt

bintrees can be found on bitbucket.org at:

http://bitbucket.org/mozman/bintrees