Georg Brandl committed 1cdeb5e

Fix #58: Add extension tutorial and better docs on where to put extensions.

Comments (0)

Files changed (2)

+.. _exttut:
+Tutorial: Writing a simple extension
+This section is intended as a walkthrough for the creation of custom extensions.
+It covers the basics of writing and activating an extensions, as well as
+commonly used features of extensions.
+As an example, we will cover a "todo" extension that adds capabilities to
+include todo entries in the documentation, and collecting these in a central
+place.  (A similar "todo" extension is distributed with Sphinx.)
+Build Phases
+One thing that is vital in order to understand extension mechanisms is the way
+in which a Sphinx project is built: this works in several phases.
+**Phase 0: Initialization**
+   In this phase, almost nothing interesting for us happens.  The source
+   directory is searched for source files, and extensions are initialized.
+   Should a stored build environment exist, it is loaded, otherwise a new one is
+   created.
+**Phase 1: Reading**
+   In Phase 1, all source files (and on subsequent builds, those that are new or
+   changed) are read and parsed.  This is the phase where directives and roles
+   are encountered by the docutils, and the corresponding functions are called.
+   The output of this phase is a *doctree* for each source files, that is a tree
+   of docutils nodes.  For document elements that aren't fully known until all
+   existing files are read, temporary nodes are created.
+   During reading, the build environment is updated with all meta- and cross
+   reference data of the read documents, such as labels, the names of headings,
+   described Python objects and index entries.  This will later be used to
+   replace the temporary nodes.
+   The parsed doctrees are stored on the disk, because it is not possible to
+   hold all of them in memory.
+**Phase 2: Consistency checks**
+   Some checking is done to ensure no surprises in the built documents.
+**Phase 3: Resolving**
+   Now that the metadata and cross-reference data of all existing documents is
+   known, all temporary nodes are replaced by nodes that can be converted into
+   output.  For example, links are created for object references that exist, and
+   simple literal nodes are created for those that don't.
+**Phase 4: Writing**
+   This phase converts the resolved doctrees to the desired output format, such
+   as HTML or LaTeX.  This happens via a so-called docutils writer that visits
+   the individual nodes of each doctree and produces some output in the process.
+.. note::
+   Some builders deviate from this general build plan, for example, the builder
+   that checks external links does not need anything more than the parsed
+   doctrees and therefore does not have phases 2--4.
+Extension Design
+We want the extension to add the following to Sphinx:
+* A "todo" directive, containing some content that is marked with "TODO", and
+  only shown in the output if a new config value is set.  (Todo entries should
+  not be in the output by default.)
+* A "todolist" directive that creates a list of all todo entries throughout the
+  documentation.
+For that, we will need to add the following elements to Sphinx:
+* New directives, called ``todo`` and ``todolist``.
+* New document tree nodes to represent these directives, conventionally also
+  called ``todo`` and ``todolist``.  We wouldn't need new nodes if the new
+  directives only produced some content representable by existing nodes.
+* A new config value ``todo_include_todos`` (config value names should start
+  with the extension name, in order to stay unique) that controls whether todo
+  entries make it into the output.
+* New event handlers: one for the :event:`doctree-resolved` event, to replace
+  the todo and todolist nodes, and one for :event:`env-purge-doc` (the reason
+  for that will be covered later).
+The Setup Function
+.. currentmodule:: sphinx.application
+The new elements are added in the extension's setup function.  Let us create a
+new Python module called :file:`` and add the setup function::
+   def setup(app):
+       app.add_config_value('todo_include_todos', False, False)
+       app.add_node(todolist)
+       app.add_node(todo,
+                    html=(visit_todo_node, depart_todo_node),
+                    latex=(visit_todo_node, depart_todo_node),
+                    text=(visit_todo_node, depart_todo_node))
+       app.add_directive('todo', todo_directive, 1, (0, 0, 1))
+       app.add_directive('todolist', todolist_directive, 0, (0, 0, 0))
+       app.connect('doctree-resolved', process_todo_nodes)
+       app.connect('env-purge-doc', purge_todos)
+The calls in this function refer to classes and functions not yet written.  What
+the individual calls do is the following:
+* :meth:`~Sphinx.add_config_value` lets Sphinx know that it should recognize the
+  new *config value* ``todo_include_todos``, whose default value should be
+  ``False`` (this also tells Sphinx that it is a boolean value).
+  If the third argument was ``True``, all documents would be re-read if the
+  config value changed its value.  This is needed for config values that
+  influence reading (build phase 1).
+* :meth:`~Sphinx.add_node` adds a new *node class* to the build system.  It also
+  can specify visitor functions for each supported output format.  These visitor
+  functions are needed when the new nodes stay until phase 4 -- since the
+  ``todolist`` node is always replaced in phase 3, it doesn't need any.
+  We need to create the two node classes ``todo`` and ``todolist`` later.
+* :meth:`~Sphinx.add_directive` adds a new *directive*, given by name, handler
+  function and two arguments that specify if the directive has content and how
+  many arguments it accepts.
+  The handler functions are created later.
+* Finally, :meth:`~Sphinx.connect` adds an *event handler* to the event whose
+  name is given by the first argument.  The event handler function is called
+  with several arguments which are documented with the event.
+The Node Classes
+Let's start with the node classes::
+   from docutils import nodes
+   class todo(nodes.Admonition, nodes.Element):
+       pass
+   class todolist(nodes.General, nodes.Element):
+       pass
+   def visit_todo_node(self, node):
+       self.visit_admonition(node)
+   def depart_todo_node(self, node):
+       self.depart_admonition(node)
+Node classes usually don't have to do anything except inherit from the standard
+docutils classes defined in :mod:`docutils.nodes`.  ``todo`` inherits from
+``Admonition`` because it should be handled like a note or warning, ``todolist``
+is just a "general" node.
+The Directive Handlers
+A directive handler is a function with a host of arguments, covered in detail in
+the docutils documentation.  It must return a list of nodes.
+The ``todolist`` directive is quite simple::
+   def todolist_directive(name, arguments, options, content, lineno,
+                          content_offset, block_text, state, state_machine):
+       return [todolist('')]
+An instance of our ``todolist`` node class is created and returned.  The
+todolist directive has neither content nor arguments that need to be handled.
+The ``todo`` directive function looks like this::
+   from sphinx.util.compat import make_admonition
+   def todo_directive(name, arguments, options, content, lineno,
+                      content_offset, block_text, state, state_machine):
+       env = state.document.settings.env
+       targetid = "todo-%s" % env.index_num
+       env.index_num += 1
+       targetnode ='', '', ids=[targetid])
+       ad = make_admonition(todo, name, [_('Todo')], options, content, lineno,
+                            content_offset, block_text, state, state_machine)
+       if not hasattr(env, 'todo_all_todos'):
+           env.todo_all_todos = []
+       env.todo_all_todos.append({
+           'docname': env.docname,
+           'lineno': lineno,
+           'todo': ad[0].deepcopy(),
+           'target': targetnode,
+       })
+       return [targetnode] + ad
+Several important things are covered here. First, as you can see, you can refer
+to the build environment instance using ``state.document.settings.env``.
+Then, to act as a link target (from the todolist), the todo directive needs to
+return a target node in addition to the todo node.  The target ID (in HTML, this
+will be the anchor name) is generated by using ``env.index_num`` which is
+persistent between directive calls and therefore leads to unique target names.
+The target node is instantiated without any text (the first two arguments).
+An admonition is created using a standard docutils function (wrapped in Sphinx
+for docutils cross-version compatibility).  The first argument gives the node
+class, in our case ``todo``.  The third argument gives the admonition title (use
+``arguments`` here to let the user specify the title).  A list of nodes is
+returned from ``make_admonition``.
+Then, the todo node is added to the environment.  This is needed to be able to
+create a list of all todo entries throughout the documentation, in the place
+where the author puts a ``todolist`` directive.  For this case, the environment
+attribute ``todo_all_todos`` is used (again, the name should be unique, so it is
+prefixed by the extension name).  It does not exist when a new environment is
+created, so the directive must check and create it if necessary.  Various
+information about the todo entry's location are stored along with a copy of the
+In the last line, the nodes that should be put into the doctree are returned:
+the target node and the admonition node.
+The Event Handlers
+Finally, let's look at the event handlers.  First, the one for the
+:event:`env-purge-doc` event::
+   def purge_todos(app, env, docname):
+       if not hasattr(env, 'todo_all_todos'):
+           return
+       env.todo_all_todos = [todo for todo in env.todo_all_todos
+                             if todo['docname'] != docname]
+Since we store information from source files in the environment, which is
+persistent, it may become out of date when the source file changes.  Therefore,
+before each source file is read, the environment's records of it are cleared,
+and the :event:`env-purge-doc` event gives extensions a chance to do the same.
+Here we clear out all todos whose docname matches the given one from the
+``todo_all_todos`` list.  If there are todos left in the document, they will be
+added again during parsing.
+The other handler belongs to the :event:`doctree-resolved` event.  This event is
+emitted at the end of phase 3 and allows custom resolving to be done::
+   def process_todo_nodes(app, doctree, fromdocname):
+       if not app.config.todo_include_todos:
+           for node in doctree.traverse(todo_node):
+               node.parent.remove(node)
+       # Replace all todolist nodes with a list of the collected todos.
+       # Augment each todo with a backlink to the original location.
+       env = app.builder.env
+       for node in doctree.traverse(todolist):
+           if not app.config.todo_include_todos:
+               node.replace_self([])
+               continue
+           content = []
+           for todo_info in env.todo_all_todos:
+               para = nodes.paragraph()
+               filename = env.doc2path(todo_info['docname'], base=None)
+               description = (
+                   _('(The original entry is located in %s, line %d and can be found ') %
+                   (filename, todo_info['lineno']))
+               para += nodes.Text(description, description)
+               # Create a reference
+               newnode = nodes.reference('', '')
+               innernode = nodes.emphasis(_('here'), _('here'))
+               newnode['refdocname'] = todo_info['docname']
+               newnode['refuri'] = app.builder.get_relative_uri(
+                   fromdocname, todo_info['docname'])
+               newnode['refuri'] += '#' + todo_info['target']['refid']
+               newnode.append(innernode)
+               para += newnode
+               para += nodes.Text('.)', '.)')
+               # Insert into the todolist
+               content.append(todo_info['todo'])
+               content.append(para)
+           node.replace_self(content)
+It is a bit more involved.  If our new "todo_include_todos" config value is
+false, all todo and todolist nodes are removed from the documents.
+If not, todo nodes just stay where and how they are.  Todolist nodes are
+replaced by a list of todo entries, complete with backlinks to the location
+where they come from.  The list items are composed of the nodes from the todo
+entry and docutils nodes created on the fly: a paragraph for each entry,
+containing text that gives the location, and a link (reference node containing
+an italic node) with the backreference.  The reference URI is built by
+``app.builder.get_relative_uri`` which creates a suitable URI depending on the
+used builder, and appending the todo node's (the target's) ID as the anchor
 Since many projects will need special features in their documentation, Sphinx is
 designed to be extensible on several levels.
-First, you can add new :term:`builder`\s to support new output formats or
-actions on the parsed documents.  Then, it is possible to register custom
-reStructuredText roles and directives, extending the markup.  And finally, there
-are so-called "hook points" at strategic places throughout the build process,
-where an extension can register a hook and run specialized code.
+This is what you can do in an extension: First, you can add new
+:term:`builder`\s to support new output formats or actions on the parsed
+documents.  Then, it is possible to register custom reStructuredText roles and
+directives, extending the markup.  And finally, there are so-called "hook
+points" at strategic places throughout the build process, where an extension can
+register a hook and run specialized code.
-The configuration file itself can be an extension, see the :confval:`extensions`
-configuration value docs.
+An extension is simply a Python module.  When an extension is loaded, Sphinx
+imports this module and executes its ``setup()`` function, which in turn
+notifies Sphinx of everything the extension offers -- see the extension tutorial
+for examples.
+The configuration file itself can be treated as an extension if it contains a
+``setup()`` function.  All other extensions to load must be listed in the
+:confval:`extensions` configuration value.
 .. toctree::
+   ext/tutorial
 There are several extensions that are not (yet) maintained in the Sphinx
 distribution.  The `Wiki at BitBucket`_ maintains a list of those.
+If you write an extension that you think others will find useful, please write
+to the project mailing list ( and we'll find the
+proper way of including or hosting it for the public.
 .. _Wiki at BitBucket:
+Where to put your own extensions?
+Extensions local to a project should be put within the project's directory
+structure.  Set Python's module search path, ``sys.path``, accordingly so that
+Sphinx can find them.
+E.g., if your extension ```` lies in the ``exts`` subdirectory of the
+project root, put into :file:``::
+   import sys, os
+   sys.path.append(os.path.abspath('exts'))
+   extensions = ['foo']
+You can also install extensions anywhere else on ``sys.path``, e.g. in the
+``site-packages`` directory.