django-treebeard / docs / mp_tree.rst

Full commit

Materialized Path trees

This is an efficient implementation of Materialized Path trees for Django 1.3+, as described by Vadim Tropashko in SQL Design Patterns. Materialized Path is probably the fastest way of working with trees in SQL without the need of extra work in the database, like Oracle's CONNECT BY or sprocs and triggers for nested intervals.

In a materialized path approach, every node in the tree will have a :attr:`~MP_Node.path` attribute, where the full path from the root to the node will be stored. This has the advantage of needing very simple and fast queries, at the risk of inconsistency because of the denormalization of parent/child foreign keys. This can be prevented with transactions.

django-treebeard uses a particular approach: every step in the path has a fixed width and has no separators. This makes queries predictable and faster at the cost of using more characters to store a step. To address this problem, every step number is encoded.

Also, two extra fields are stored in every node: :attr:`~MP_Node.depth` and :attr:`~MP_Node.numchild`. This makes the read operations faster, at the cost of a little more maintenance on tree updates/inserts/deletes. Don't worry, even with these extra steps, materialized path is more efficient than other approaches.


The materialized path approach makes heavy use of LIKE in your database, with clauses like WHERE path LIKE '002003%'. If you think that LIKE is too slow, you're right, but in this case the :attr:`~MP_Node.path` field is indexed in the database, and all LIKE clauses that don't start with a % character will use the index. This is what makes the materialized path approach so fast.