Anonymous committed df246e3

Initial import of the doc tools.

Files changed (107)

+PYTHON ?= python
+
+export PYTHONPATH = $(shell echo "$$PYTHONPATH"):./sphinx
+
+.PHONY: all check clean clean-pyc pylint reindent testserver
+
+all: clean-pyc check
+
+check:
+	@$(PYTHON) utils/check_sources.py -i sphinx/style/jquery.js sphinx
+	@$(PYTHON) utils/check_sources.py converter
+
+clean: clean-pyc
+
+clean-pyc:
+	find . -name '*.pyc' -exec rm -f {} +
+	find . -name '*.pyo' -exec rm -f {} +
+	find . -name '*~' -exec rm -f {} +
+
+pylint:
+	@pylint --rcfile utils/pylintrc sphinx converter
+
+reindent:
+	@$(PYTHON) utils/reindent.py -r -B .
+py-rest-doc
+===========
+
+This sandbox project is about moving the official Python documentation
+to reStructuredText.
+
+
+What you need to know
+---------------------
+
+This project uses Python 2.5 features, so you'll need a working Python
+2.5 setup.
+
+If you want code highlighting, you need Pygments >= 0.8, easily
+installable from PyPI.  Jinja, the template engine, is included as an
+SVN external.
+
+For the rest of this document, let's assume that you have a Python
+checkout (you need the 2.6 line, i.e. the trunk) in ~/devel/python and
+this checkout in the current directory.
+
+To convert the LaTeX doc to reST, you first have to apply the patch in
+``etc/inst.diff`` to the ``inst/inst.tex`` LaTeX file in the Python
+checkout::
+
+   patch -d ~/devel/python/Doc -p0 < etc/inst.diff
+
+Then, create a target directory for the reST sources and run the
+converter script::
+
+   mkdir sources
+   python convert.py ~/devel/python/Doc sources
+
+This will convert all LaTeX sources to reST files in the ``sources``
+directory.
+
+The ``sources`` directory contains a ``conf.py`` file with general
+configuration for the build process, such as the Python version that
+should be shown, or the date format for "last updated on" notes.
+
+
+Building the HTML version
+-------------------------
+
+Once the sources are converted, create a target directory and run ::
+
+   mkdir build-html
+   python sphinx-build.py -b html sources build-html
+
+This will create HTML files in the ``build-html`` directory.
+
+The ``build-html`` directory will also contain a ``.doctrees``
+directory, which caches pickles containing the docutils doctrees for
+all source files, as well as an ``environment.pickle`` file that
+collects all meta-information and data that's needed to
+cross-reference the sources and generate indices.
+
+
+Running the online (web) version
+--------------------------------
+
+First, you need to build the source with the "web" builder::
+
+   mkdir build-web
+   python sphinx-build.py -b web sources build-web
+
+This will create files with pickled contents for the web application
+in the target directory.
+
+Then, you can run ::
+
+   python sphinx-web.py build-web
+
+which will start a web server using wsgiref on ``localhost:3000``.  The
+web application has a configuration file ``build-web/webconf.py``,
+where you can configure the server and port for the application as
+well as various other settings specific to the web app.
+
+Global TODO
+===========
+
+- discuss and debug comments system
+- write new Makefile, handle automatic version info and checkout
+- write a "printable" builder (export to latex, most probably)
+- discuss the default role
+- discuss lib -> ref section move
+- prepare for databases other than sqlite for comments
+- look at the old tools/ scripts and decide what functionality should be rewritten
+- add search via Xapian?
+- optionally have a contents tree view in the sidebar (AJAX based)?
+
convert.py

+# -*- coding: utf-8 -*-
+"""
+    Convert the Python documentation to Sphinx
+    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+    :copyright: 2007 by Georg Brandl.
+    :license: Python license.
+"""
+
+import sys
+import os
+
+from converter import convert_dir
+
+if __name__ == '__main__':
+    try:
+        rootdir = sys.argv[1]
+        destdir = os.path.abspath(sys.argv[2])
+    except IndexError:
+        print "usage: convert.py docrootdir destdir"
+        # exit with a non-zero status so callers can detect the usage error
+        sys.exit(1)
+
+    assert os.path.isdir(os.path.join(rootdir, 'texinputs'))
+    os.chdir(rootdir)
+    convert_dir(destdir, *sys.argv[3:])

converter/__init__.py

+# -*- coding: utf-8 -*-
+"""
+    Documentation converter - high level functions
+    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+    :copyright: 2007 by Georg Brandl.
+    :license: Python license.
+"""
+
+import sys
+import os
+import glob
+import shutil
+import codecs
+from os import path
+
+from .tokenizer import Tokenizer
+from .latexparser import DocParser
+from .restwriter import RestWriter
+from .filenamemap import (fn_mapping, copyfiles_mapping, newfiles_mapping,
+                          rename_mapping, dirs_to_make, toctree_mapping,
+                          amendments_mapping)
+from .console import red, green
+
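+def convert_file(infile, outfile, doraise=True, splitchap=False,
+                 toctree=None, deflang=None, labelprefix=''):
+    """
+    Convert one LaTeX file to reST.
+
+    Return ``(1, warnings)`` on success.  If parsing or writing fails,
+    re-raise the exception when `doraise` is true, else return
+    ``(0, error message)``.
+    """
+    inf = codecs.open(infile, 'r', 'latin1')
+    try:
+        p = DocParser(Tokenizer(inf.read()).tokenize(), infile)
+    finally:
+        # close the input file even if tokenizing fails
+        inf.close()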
+    if not splitchap:
+        outf = codecs.open(outfile, 'w', 'utf-8')
+    else:
+        outf = None
+    r = RestWriter(outf, splitchap, toctree, deflang, labelprefix)
+    try:
+        r.write_document(p.parse())
+        if splitchap:
+            for i, chapter in enumerate(r.chapters[1:]):
+                coutf = codecs.open('%s/%d_%s' % (
+                    path.dirname(outfile), i+1, path.basename(outfile)),
+                                    'w', 'utf-8')
+                coutf.write(chapter.getvalue())
+                coutf.close()
+        else:
+            outf.close()
+        return 1, r.warnings
+    except Exception, err:
+        if doraise:
+            raise
+        return 0, str(err)
+
+
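+def convert_dir(outdirname, *args):
+    """
+    Convert the doc tree in the current directory into `outdirname`.
+
+    If `args` is given, only the named subdirectories are converted.
+    """
+    # make directories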
+    for dirname in dirs_to_make:
+        try:
+            os.mkdir(path.join(outdirname, dirname))
+        except OSError:
+            pass
+
+    # copy files (currently only non-tex includes)
+    for oldfn, newfn in copyfiles_mapping.iteritems():
+        newpathfn = path.join(outdirname, newfn)
+        globfns = glob.glob(oldfn)
+        if len(globfns) == 1 and not path.isdir(newpathfn):
+            shutil.copyfile(globfns[0], newpathfn)
+        else:
+            for globfn in globfns:
+                shutil.copyfile(globfn, path.join(newpathfn,
+                                                  path.basename(globfn)))
+
+    # convert tex files
+    # "doc" is not converted. It must be rewritten anyway.
+    for subdir in ('api', 'dist', 'ext', 'inst', 'commontex',
+                   'lib', 'mac', 'ref', 'tut', 'whatsnew'):
+        if args and subdir not in args:
+            continue
+        if subdir not in fn_mapping:
+            continue
+        newsubdir = fn_mapping[subdir]['__newname__']
+        deflang = fn_mapping[subdir].get('__defaulthighlightlang__')
+        labelprefix = fn_mapping[subdir].get('__labelprefix__', '')
+        for filename in sorted(os.listdir(subdir)):
+            if not filename.endswith('.tex'):
+                continue
+            filename = filename[:-4] # strip extension
+            newname = fn_mapping[subdir][filename]
+            if newname is None:
+                continue
+            if newname.endswith(':split'):
+                newname = newname[:-6]
+                splitchap = True
+            else:
+                splitchap = False
+            if '/' not in newname:
+                outfilename = path.join(outdirname, newsubdir, newname + '.rst')
+            else:
+                outfilename = path.join(outdirname, newname + '.rst')
+            toctree = toctree_mapping.get(path.join(subdir, filename))
+            infilename = path.join(subdir, filename + '.tex')
+            print green(infilename),
+            success, state = convert_file(infilename, outfilename, False,
+                                          splitchap, toctree, deflang, labelprefix)
+            if not success:
+                print red("ERROR:")
+                print red("    " + state)
+            else:
+                if state:
+                    print "warnings:"
+                    for warning in state:
+                        print "    " + warning
+
+    # rename files, e.g. split ones
+    for oldfn, newfn in rename_mapping.iteritems():
+        try:
+            if newfn is None:
+                os.unlink(path.join(outdirname, oldfn))
+            else:
+                os.rename(path.join(outdirname, oldfn),
+                          path.join(outdirname, newfn))
+        except OSError, err:
+            if err.errno == 2:
+                # ENOENT: the file was not generated, nothing to rename or delete
+                continue
+            raise
+
+    # copy new files
+    srcdirname = path.join(path.dirname(__file__), 'newfiles')
+    for fn, newfn in newfiles_mapping.iteritems():
+        shutil.copyfile(path.join(srcdirname, fn),
+                        path.join(outdirname, newfn))
+
+    # make amendments
+    for newfn, (pre, post) in amendments_mapping.iteritems():
+        fn = path.join(outdirname, newfn)
+        try:
+            ft = open(fn).read()
+        except Exception, err:
+            print "Error making amendments to %s: %s" % (newfn, err)
+            continue
+        else:
+            fw = open(fn, 'w')
+            fw.write(pre)
+            fw.write(ft)
+            fw.write(post)
+            fw.close()
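
The output-name rules used by ``convert_dir`` and ``convert_file`` above can be sketched in isolation. This is a minimal illustration and not part of the commit: the helper names ``resolve_output`` and ``chapter_filename`` are hypothetical, ``posixpath`` stands in for ``os.path`` so the results look the same on every platform, and the code is written to also run on Python 3.

```python
import posixpath as path  # assumption: POSIX-style paths, for reproducible output


def resolve_output(outdirname, newsubdir, newname):
    """Hypothetical helper mirroring the name handling in convert_dir.

    Returns a tuple (outfilename, splitchap).
    """
    splitchap = newname.endswith(':split')
    if splitchap:
        # drop the ':split' marker, which requests one file per chapter
        newname = newname[:-len(':split')]
    if '/' not in newname:
        # plain names are placed below the subdirectory's new name
        outfilename = path.join(outdirname, newsubdir, newname + '.rst')
    else:
        # names containing a slash are relative to the output root
        outfilename = path.join(outdirname, newname + '.rst')
    return outfilename, splitchap


def chapter_filename(outfile, index):
    """Hypothetical helper: name of the chapter file with the given number."""
    return '%s/%d_%s' % (path.dirname(outfile), index, path.basename(outfile))
```

For example, ``resolve_output('sources', 'tutorial', 'tutorial:split')`` gives ``('sources/tutorial/tutorial.rst', True)``, and ``chapter_filename('sources/tutorial/tutorial.rst', 2)`` gives ``'sources/tutorial/2_tutorial.rst'``, which is the kind of name the rename step then maps to a descriptive file name.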

converter/console.py

+# -*- coding: utf-8 -*-
+"""
+    Console utils
+    ~~~~~~~~~~~~~
+
+    Format colored console output.
+
+    :copyright: 1998-2004 by the Gentoo Foundation.
+    :copyright: 2006-2007 by Georg Brandl.
+    :license: GNU GPL.
+"""
+
+esc_seq = "\x1b["
+
+codes = {}
+codes["reset"]     = esc_seq + "39;49;00m"
+
+codes["bold"]      = esc_seq + "01m"
+codes["faint"]     = esc_seq + "02m"
+codes["standout"]  = esc_seq + "03m"
+codes["underline"] = esc_seq + "04m"
+codes["blink"]     = esc_seq + "05m"
+codes["overline"]  = esc_seq + "06m"  # Who made this up? Seriously.
+
+ansi_color_codes = []
+for x in xrange(30, 38):
+    ansi_color_codes.append("%im" % x)
+    ansi_color_codes.append("%i;01m" % x)
+
+rgb_ansi_colors = [
+    '0x000000', '0x555555', '0xAA0000', '0xFF5555',
+    '0x00AA00', '0x55FF55', '0xAA5500', '0xFFFF55',
+    '0x0000AA', '0x5555FF', '0xAA00AA', '0xFF55FF',
+    '0x00AAAA', '0x55FFFF', '0xAAAAAA', '0xFFFFFF'
+]
+
+for x in xrange(len(rgb_ansi_colors)):
+    codes[rgb_ansi_colors[x]] = esc_seq + ansi_color_codes[x]
+
+del x
+
+codes["black"]     = codes["0x000000"]
+codes["darkgray"]  = codes["0x555555"]
+
+codes["red"]       = codes["0xFF5555"]
+codes["darkred"]   = codes["0xAA0000"]
+
+codes["green"]     = codes["0x55FF55"]
+codes["darkgreen"] = codes["0x00AA00"]
+
+codes["yellow"]    = codes["0xFFFF55"]
+codes["brown"]     = codes["0xAA5500"]
+
+codes["blue"]      = codes["0x5555FF"]
+codes["darkblue"]  = codes["0x0000AA"]
+
+codes["fuchsia"]   = codes["0xFF55FF"]
+codes["purple"]    = codes["0xAA00AA"]
+
+codes["teal"]      = codes["0x00AAAA"]
+codes["turquoise"] = codes["0x55FFFF"]
+
+codes["white"]     = codes["0xFFFFFF"]
+codes["lightgray"] = codes["0xAAAAAA"]
+
+codes["darkteal"]   = codes["turquoise"]
+codes["darkyellow"] = codes["brown"]
+codes["fuscia"]     = codes["fuchsia"]
+# NOTE: this rebinds "white" to the bold code, replacing the 0xFFFFFF entry above
+codes["white"]      = codes["bold"]
+
+def nocolor():
+    "turn off colorization"
+    for code in codes:
+        codes[code] = ""
+
+def reset_color():
+    return codes["reset"]
+
+def colorize(color_key, text):
+    return codes[color_key] + text + codes["reset"]
+
+functions_colors = [
+    "bold", "white", "teal", "turquoise", "darkteal",
+    "fuscia", "fuchsia", "purple", "blue", "darkblue",
+    "green", "darkgreen", "yellow", "brown",
+    "darkyellow", "red", "darkred"
+]
+
+def create_color_func(color_key):
+    """
+    Return a function that formats its argument in the given color.
+    """
+    def derived_func(text):
+        return colorize(color_key, text)
+    return derived_func
+
+ns = locals()
+for c in functions_colors:
+    ns[c] = create_color_func(c)
+
+del c, ns
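
The closure pattern behind the generated color functions can be shown with a reduced table. This is a sketch for illustration only, not the module itself: it keeps just three entries of ``codes`` (with the values the module computes for ``reset``, ``red`` and ``green``) and binds the functions by hand instead of writing into the module namespace.

```python
ESC = "\x1b["

# Reduced code table (assumption: only three entries, for brevity).
codes = {
    "reset": ESC + "39;49;00m",
    "red": ESC + "31;01m",
    "green": ESC + "32;01m",
}


def colorize(color_key, text):
    # The table is consulted at call time, not when the function is created.
    return codes[color_key] + text + codes["reset"]


def create_color_func(color_key):
    """Return a function that formats its argument in the given color."""
    def derived_func(text):
        return colorize(color_key, text)
    return derived_func


def nocolor():
    """Turn off colorization by blanking every escape sequence."""
    for code in codes:
        codes[code] = ""


red = create_color_func("red")
green = create_color_func("green")
```

Because ``colorize`` looks up ``codes`` on every call, calling ``nocolor()`` also affects functions that were created earlier: ``red("x")`` then returns the plain string ``"x"``.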

converter/docnodes.py

+# -*- coding: utf-8 -*-
+"""
+    Python documentation LaTeX parser - document nodes
+    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+    :copyright: 2007 by Georg Brandl.
+    :license: Python license.
+"""
+
+
+class DocNode(object):
+    """ A node in the document tree. """
+    def __repr__(self):
+        return '%s()' % self.__class__.__name__
+
+    def __str__(self):
+        raise RuntimeError('cannot stringify docnodes')
+
+    def walk(self):
+        return []
+
+
+class CommentNode(DocNode):
+    """ A comment. """
+    def __init__(self, comment):
+        assert isinstance(comment, basestring)
+        self.comment = comment
+
+    def __repr__(self):
+        return 'CommentNode(%r)' % self.comment
+
+
+class RootNode(DocNode):
+    """ A whole document. """
+    def __init__(self, filename, children):
+        self.filename = filename
+        self.children = children
+        self.params = {}
+        self.labels = {}
+
+    def __repr__(self):
+        return 'RootNode(%r, %r)' % (self.filename, self.children)
+
+    def walk(self):
+        return self.children
+
+    def transform(self):
+        """ Do restructurings not possible during parsing. """
+        def do_descenvs(node):
+            r""" Make \xxxlines an attribute of the parent xxxdesc node. """
+            for subnode in node.walk():
+                do_descenvs(subnode)
+            if isinstance(node, DescEnvironmentNode):
+                for subnode in node.content.walk():
+                    if isinstance(subnode, DescLineCommandNode):
+                        node.additional.append((subnode.cmdname, subnode.args))
+
+        do_descenvs(self)
+
+
+class NodeList(DocNode, list):
+    """ A list of subnodes. """
+    def __init__(self, children=None):
+        list.__init__(self, children or [])
+
+    def __repr__(self):
+        return 'NL%s' % list.__repr__(self)
+
+    def walk(self):
+        return self
+
+    def append(self, node):
+        assert isinstance(node, DocNode)
+        if type(node) is EmptyNode:
+            return
+        elif self and isinstance(node, TextNode) and \
+                 type(self[-1]) is TextNode:
+            self[-1].text += node.text
+        elif type(node) is NodeList:
+            list.extend(self, node)
+        elif type(node) is VerbatimNode and self and \
+                 isinstance(self[-1], ParaSepNode):
+            # don't allow a ParaSepNode before VerbatimNode
+            # because this breaks ReST's '::'
+            self[-1] = node
+        else:
+            list.append(self, node)
+
+    def flatten(self):
+        if len(self) > 1:
+            return self
+        elif len(self) == 1:
+            return self[0]
+        else:
+            return EmptyNode()
+
+
+class ParaSepNode(DocNode):
+    """ A node for paragraph separator. """
+    def __repr__(self):
+        return 'Para'
+
+
+class TextNode(DocNode):
+    """ A node containing text. """
+    def __init__(self, text):
+        assert isinstance(text, basestring)
+        self.text = text
+
+    def __repr__(self):
+        if type(self) is TextNode:
+            return 'T%r' % self.text
+        else:
+            return '%s(%r)' % (self.__class__.__name__, self.text)
+
+
+class EmptyNode(TextNode):
+    """ An empty node. """
+    def __init__(self, *args):
+        self.text = ''
+
+
+class NbspNode(TextNode):
+    """ A non-breaking space. """
+    def __init__(self, *args):
+        # this breaks ReST markup (!)
+        #self.text = u'\N{NO-BREAK SPACE}'
+        self.text = ' '
+
+    def __repr__(self):
+        return 'NBSP'
+
+
+simplecmd_mapping = {
+    'ldots': u'...',
+    'moreargs': '...',
+    'unspecified': '...',
+    'ASCII': 'ASCII',
+    'UNIX': 'Unix',
+    'Unix': 'Unix',
+    'POSIX': 'POSIX',
+    'LaTeX': 'LaTeX',
+    'EOF': 'EOF',
+    'Cpp': 'C++',
+    'C': 'C',
+    'sub': u'--> ',
+    'textbackslash': '\\\\',
+    'textunderscore': '_',
+    'texteuro': u'\N{EURO SIGN}',
+    'textasciicircum': u'^',
+    'textasciitilde': u'~',
+    'textgreater': '>',
+    'textless': '<',
+    'textbar': '|',
+    'backslash': '\\\\',
+    'tilde': '~',
+    'copyright': u'\N{COPYRIGHT SIGN}',
+    # \e is mostly inside \code and therefore not escaped.
+    'e': '\\',
+    'infinity': u'\N{INFINITY}',
+    'plusminus': u'\N{PLUS-MINUS SIGN}',
+    'leq': u'\N{LESS-THAN OR EQUAL TO}',
+    'geq': u'\N{GREATER-THAN OR EQUAL TO}',
+    'pi': u'\N{GREEK SMALL LETTER PI}',
+    'AA': u'\N{LATIN CAPITAL LETTER A WITH RING ABOVE}',
+}
+
+class SimpleCmdNode(TextNode):
+    """ A command resulting in simple text. """
+    def __init__(self, cmdname, args):
+        self.text = simplecmd_mapping[cmdname]
+
+
+class BreakNode(DocNode):
+    """ A line break. """
+    def __repr__(self):
+        return 'BR'
+
+
+class CommandNode(DocNode):
+    """ A general command. """
+    def __init__(self, cmdname, args):
+        self.cmdname = cmdname
+        self.args = args
+
+    def __repr__(self):
+        return '%s(%r, %r)' % (self.__class__.__name__, self.cmdname, self.args)
+
+    def walk(self):
+        return self.args
+
+
+class DescLineCommandNode(CommandNode):
+    """ A \\xxxline command. """
+
+
+class InlineNode(CommandNode):
+    """ A node with inline markup. """
+    def walk(self):
+        return []
+
+
+class IndexNode(InlineNode):
+    """ An index-generating command. """
+    def __init__(self, cmdname, args):
+        self.cmdname = cmdname
+        # tricky -- this is to make this silent in paragraphs
+        # while still generating index entries for textonly()
+        self.args = []
+        self.indexargs = args
+
+
+class SectioningNode(CommandNode):
+    """ A heading node. """
+
+
+class EnvironmentNode(DocNode):
+    """ An environment. """
+    def __init__(self, envname, args, content):
+        self.envname = envname
+        self.args = args
+        self.content = content
+
+    def __repr__(self):
+        return 'EnvironmentNode(%r, %r, %r)' % (self.envname,
+                                                self.args, self.content)
+
+    def walk(self):
+        return [self.content]
+
+
+class DescEnvironmentNode(EnvironmentNode):
+    """ An xxxdesc environment. """
+    def __init__(self, envname, args, content):
+        self.envname = envname
+        self.args = args
+        self.additional = []
+        self.content = content
+
+    def __repr__(self):
+        return 'DescEnvironmentNode(%r, %r, %r)' % (self.envname,
+                                                    self.args, self.content)
+
+
+class TableNode(EnvironmentNode):
+    """ A table. """
+    def __init__(self, numcols, headings, lines):
+        self.numcols = numcols
+        self.headings = headings
+        self.lines = lines
+
+    def __repr__(self):
+        return 'TableNode(%r, %r, %r)' % (self.numcols,
+                                          self.headings, self.lines)
+
+    def walk(self):
+        return []
+
+
+class VerbatimNode(DocNode):
+    """ A verbatim code block. """
+    def __init__(self, content):
+        self.content = content
+
+    def __repr__(self):
+        return 'VerbatimNode(%r)' % self.content
+
+
+class ListNode(DocNode):
+    """ A list. """
+    def __init__(self, items):
+        self.items = items
+
+    def __repr__(self):
+        return '%s(%r)' % (self.__class__.__name__, self.items)
+
+    def walk(self):
+        return [item[1] for item in self.items]
+
+
+class ItemizeNode(ListNode):
+    """ An enumeration with bullets. """
+
+
+class EnumerateNode(ListNode):
+    """ An enumeration with numbers. """
+
+
+class DescriptionNode(ListNode):
+    """ A description list. """
+
+
+class DefinitionsNode(ListNode):
+    """ A definition list. """
+
+
+class ProductionListNode(ListNode):
+    """ A grammar production list. """
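
The merging behavior of ``NodeList.append`` is easiest to see with a cut-down version of the classes. The following is a simplified sketch, not the module itself: it omits the ``VerbatimNode``/``ParaSepNode`` rule and the ``assert``, keeps only the node types needed for the demonstration, and is written to also run on Python 3.

```python
class DocNode(object):
    """A node in the document tree (reduced for this sketch)."""


class TextNode(DocNode):
    def __init__(self, text):
        self.text = text


class EmptyNode(TextNode):
    def __init__(self, *args):
        self.text = ''


class NodeList(DocNode, list):
    def __init__(self, children=None):
        list.__init__(self, children or [])

    def append(self, node):
        if type(node) is EmptyNode:
            # empty nodes are dropped silently
            return
        elif self and isinstance(node, TextNode) and \
                 type(self[-1]) is TextNode:
            # adjacent text is merged into the previous plain TextNode
            self[-1].text += node.text
        elif type(node) is NodeList:
            # nested lists are spliced in, not stored as a single item
            list.extend(self, node)
        else:
            list.append(self, node)

    def flatten(self):
        if len(self) > 1:
            return self
        elif len(self) == 1:
            return self[0]
        else:
            return EmptyNode()


nodes = NodeList()
nodes.append(TextNode('Hello, '))
nodes.append(EmptyNode())
nodes.append(TextNode('world'))
```

After these three calls ``nodes`` holds a single ``TextNode`` whose text is ``'Hello, world'``, so ``nodes.flatten()`` returns that node directly instead of a one-element list.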

converter/filenamemap.py

+# -*- coding: utf-8 -*-
+"""
+    Map LaTeX filenames to ReST filenames
+    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+    :copyright: 2007 by Georg Brandl.
+    :license: Python license.
+"""
+
+# '' means: use same name, strip prefix if applicable.
+# None means: don't translate at all.
+
+_mapping = {
+    'lib': {
+        '__newname__' : 'modules',
+
+        'asttable': '',
+        'compiler': '',
+        'distutils': '',
+        'email': '',
+        'emailcharsets': 'email.charset',
+        'emailencoders': 'email.encoders',
+        'emailexc': 'email.errors',
+        'emailgenerator': 'email.generator',
+        'emailheaders': 'email.header',
+        'emailiter': 'email.iterators',
+        'emailmessage': 'email.message',
+        'emailmimebase': 'email.mime',
+        'emailparser': 'email.parser',
+        'emailutil': 'email.util',
+        'libaifc': '',
+        'libanydbm': '',
+        'libarray': '',
+        'libascii': 'curses.ascii',
+        'libast': '',
+        'libasynchat': '',
+        'libasyncore': '',
+        'libatexit': '',
+        'libaudioop': '',
+        'libbase64': '',
+        'libbasehttp': 'basehttpserver',
+        'libbastion': '',
+        'libbinascii': '',
+        'libbinhex': '',
+        'libbisect': '',
+        'libbltin': '__builtin__',
+        'libbsddb': '',
+        'libbz2': '',
+        'libcalendar': '',
+        'libcfgparser': 'configparser',
+        'libcgihttp': 'cgihttpserver',
+        'libcgi': '',
+        'libcgitb': '',
+        'libchunk': '',
+        'libcmath': '',
+        'libcmd': '',
+        'libcodecs': '',
+        'libcodeop': '',
+        'libcode': '',
+        'libcollections': '',
+        'libcolorsys': '',
+        'libcommands': '',
+        'libcompileall': '',
+        'libcontextlib': '',
+        'libcookielib': '',
+        'libcookie': '',
+        'libcopyreg': 'copy_reg',
+        'libcopy': '',
+        'libcrypt': '',
+        'libcsv': '',
+        'libctypes': '',
+        'libcursespanel': 'curses.panel',
+        'libcurses': '',
+        'libdatetime': '',
+        'libdbhash': '',
+        'libdbm': '',
+        'libdecimal': '',
+        'libdifflib': '',
+        'libdircache': '',
+        'libdis': '',
+        'libdl': '',
+        'libdoctest': '',
+        'libdocxmlrpc': 'docxmlrpcserver',
+        'libdumbdbm': '',
+        'libdummythreading': 'dummy_threading',
+        'libdummythread': 'dummy_thread',
+        'liberrno': '',
+        'libetree': 'xml.etree.elementtree',
+        'libfcntl': '',
+        'libfilecmp': '',
+        'libfileinput': '',
+        'libfnmatch': '',
+        'libformatter': '',
+        'libfpectl': '',
+        'libfpformat': '',
+        'libftplib': '',
+        'libfunctools': '',
+        'libfuture': '__future__',
+        'libgc': '',
+        'libgdbm': '',
+        'libgetopt': '',
+        'libgetpass': '',
+        'libgettext': '',
+        'libglob': '',
+        'libgrp': '',
+        'libgzip': '',
+        'libhashlib': '',
+        'libheapq': '',
+        'libhmac': '',
+        'libhotshot': '',
+        'libhtmllib': '',
+        'libhtmlparser': '',
+        'libhttplib': '',
+        'libimageop': '',
+        'libimaplib': '',
+        'libimgfile': '',
+        'libimghdr': '',
+        'libimp': '',
+        'libinspect': '',
+        'libitertools': '',
+        'libjpeg': '',
+        'libkeyword': '',
+        'liblinecache': '',
+        'liblocale': '',
+        'liblogging': '',
+        'libmailbox': '',
+        'libmailcap': '',
+        'libmain': '__main__',
+        'libmarshal': '',
+        'libmath': '',
+        'libmd5': '',
+        'libmhlib': '',
+        'libmimetools': '',
+        'libmimetypes': '',
+        'libmimewriter': '',
+        'libmimify': '',
+        'libmmap': '',
+        'libmodulefinder': '',
+        'libmsilib': '',
+        'libmsvcrt': '',
+        'libmultifile': '',
+        'libmutex': '',
+        'libnetrc': '',
+        'libnew': '',
+        'libnis': '',
+        'libnntplib': '',
+        'liboperator': '',
+        'liboptparse': '',
+        'libos': '',
+        'libossaudiodev': '',
+        'libparser': '',
+        'libpdb': '',
+        'libpickle': '',
+        'libpickletools': '',
+        'libpipes': '',
+        'libpkgutil': '',
+        'libplatform': '',
+        'libpopen2': '',
+        'libpoplib': '',
+        'libposixpath': 'os.path',
+        'libposix': '',
+        'libpprint': '',
+        'libprofile': '',
+        'libpty': '',
+        'libpwd': '',
+        'libpyclbr': '',
+        'libpycompile': 'py_compile',
+        'libpydoc': '',
+        'libpyexpat': '',
+        'libqueue': '',
+        'libquopri': '',
+        'librandom': '',
+        'libreadline': '',
+        'librepr': '',
+        'libre': '',
+        'libresource': '',
+        'librexec': '',
+        'librfc822': '',
+        'librlcompleter': '',
+        'librobotparser': '',
+        'librunpy': '',
+        'libsched': '',
+        'libselect': '',
+        'libsets': '',
+        'libsgmllib': '',
+        'libsha': '',
+        'libshelve': '',
+        'libshlex': '',
+        'libshutil': '',
+        'libsignal': '',
+        'libsimplehttp': 'simplehttpserver',
+        'libsimplexmlrpc': 'simplexmlrpcserver',
+        'libsite': '',
+        'libsmtpd': '',
+        'libsmtplib': '',
+        'libsndhdr': '',
+        'libsocket': '',
+        'libsocksvr': 'socketserver',
+        'libspwd': '',
+        'libsqlite3': '',
+        'libstat': '',
+        'libstatvfs': '',
+        'libstringio': '',
+        'libstringprep': '',
+        'libstring': '',
+        'libstruct': '',
+        'libsunaudio': '',
+        'libsunau': '',
+        'libsubprocess': '',
+        'libsymbol': '',
+        'libsyslog': '',
+        'libsys': '',
+        'libtabnanny': '',
+        'libtarfile': '',
+        'libtelnetlib': '',
+        'libtempfile': '',
+        'libtermios': '',
+        'libtest': '',
+        'libtextwrap': '',
+        'libthreading': '',
+        'libthread': '',
+        'libtimeit': '',
+        'libtime': '',
+        'libtokenize': '',
+        'libtoken': '',
+        'libtraceback': '',
+        'libtrace': '',
+        'libtty': '',
+        'libturtle': '',
+        'libtypes': '',
+        'libunicodedata': '',
+        'libunittest': '',
+        'liburllib2': '',
+        'liburllib': '',
+        'liburlparse': '',
+        'libuserdict': '',
+        'libuser': '',
+        'libuuid': '',
+        'libuu': '',
+        'libwarnings': '',
+        'libwave': '',
+        'libweakref': '',
+        'libwebbrowser': '',
+        'libwhichdb': '',
+        'libwinreg': '_winreg',
+        'libwinsound': '',
+        'libwsgiref': '',
+        'libxdrlib': '',
+        'libxmllib': '',
+        'libxmlrpclib': '',
+        'libzipfile': '',
+        'libzipimport': '',
+        'libzlib': '',
+        'tkinter': '',
+        'xmldomminidom': 'xml.dom.minidom',
+        'xmldompulldom': 'xml.dom.pulldom',
+        'xmldom': 'xml.dom',
+        'xmletree': 'xml.etree',
+        'xmlsaxhandler': 'xml.sax.handler',
+        'xmlsaxreader': 'xml.sax.reader',
+        'xmlsax': 'xml.sax',
+        'xmlsaxutils': 'xml.sax.utils',
+        'libal': '',
+        'libcd': '',
+        'libfl': '',
+        'libfm': '',
+        'libgl': '',
+        'libposixfile': '',
+
+        # specials
+        'libundoc': '',
+        'libintro': '',
+
+        # -> ref
+        'libconsts': 'reference/consts',
+        'libexcs': 'reference/exceptions',
+        'libfuncs': 'reference/functions',
+        'libobjs': 'reference/objects',
+        'libstdtypes': 'reference/stdtypes',
+
+        # mainfiles
+        'lib': None,
+        'mimelib': None,
+
+        # obsolete
+        'libni': None,
+        'libcmpcache': None,
+        'libcmp': None,
+
+        # chapter overviews
+        'fileformats': '',
+        'filesys': '',
+        'frameworks': '',
+        'i18n': '',
+        'internet': '',
+        'ipc': '',
+        'language': '',
+        'archiving': '',
+        'custominterp': '',
+        'datatypes': '',
+        'development': '',
+        'markup': '',
+        'modules': '',
+        'netdata': '',
+        'numeric': '',
+        'persistence': '',
+        'windows': '',
+        'libsun': '',
+        'libmm': '',
+        'liballos': '',
+        'libcrypto': '',
+        'libsomeos': '',
+        'libsgi': '',
+        'libmisc': '',
+        'libpython': '',
+        'librestricted': '',
+        'libstrings': '',
+        'libunix': '',
+    },
+
+    'ref': {
+        '__newname__': 'reference',
+        'ref': None,
+        'ref1': 'introduction',
+        'ref2': 'lexical_analysis',
+        'ref3': 'datamodel',
+        'ref4': 'executionmodel',
+        'ref5': 'expressions',
+        'ref6': 'simple_stmts',
+        'ref7': 'compound_stmts',
+        'ref8': 'toplevel_components',
+    },
+
+    'tut': {
+        '__newname__': 'tutorial',
+        '__labelprefix__': 'tut-',
+        'tut': 'tutorial:split',
+        'glossary': 'glossary',
+    },
+
+    'api': {
+        '__newname__': 'c-api',
+        '__defaulthighlightlang__': 'c',
+        'api': None,
+
+        'abstract': '',
+        'concrete': '',
+        'exceptions': '',
+        'init': '',
+        'intro': '',
+        'memory': '',
+        'newtypes': '',
+        'refcounting': '',
+        'utilities': '',
+        'veryhigh': '',
+    },
+
+    'ext': {
+        '__newname__': 'extending',
+        '__defaulthighlightlang__': 'c',
+        'ext': None,
+
+        'building': '',
+        'embedding': '',
+        'extending': 'extending',
+        'newtypes': '',
+        'windows': '',
+    },
+
+    'dist': {
+        '__newname__': 'distutils',
+        'dist': 'distutils:split',
+        'sysconfig': '',
+    },
+
+    'mac': {
+        '__newname__': 'macmodules',
+        'mac': None,
+
+        'libaepack': 'aepack',
+        'libaetools': 'aetools',
+        'libaetypes': 'aetypes',
+        'libautogil': 'autogil',
+        'libcolorpicker': 'colorpicker',
+        'libframework': 'framework',
+        'libgensuitemodule': 'gensuitemodule',
+        'libmacic': 'macic',
+        'libmacos': 'macos',
+        'libmacostools': 'macostools',
+        'libmac': 'mac',
+        'libmacui': 'macui',
+        'libminiae': 'miniae',
+        'libscrap': 'scrap',
+        'scripting': '',
+        'toolbox': '',
+        'undoc': '',
+        'using': '',
+
+    },
+
+    'inst': {
+        '__newname__': 'install',
+        '__defaulthighlightlang__': 'none',
+        'inst': 'index',
+    },
+
+    'whatsnew': {
+        '__newname__': 'whatsnew',
+        'whatsnew20': '2.0',
+        'whatsnew21': '2.1',
+        'whatsnew22': '2.2',
+        'whatsnew23': '2.3',
+        'whatsnew24': '2.4',
+        'whatsnew25': '2.5',
+        'whatsnew26': '2.6',
+    },
+
+    'commontex': {
+        '__newname__': '',
+        'boilerplate': None,
+        'patchlevel': None,
+        'copyright': '',
+        'license': '',
+        'reportingbugs': 'bugs',
+    },
+}
+
+fn_mapping = {}
+
+for dir, files in _mapping.iteritems():
+    newmap = fn_mapping[dir] = {}
+    for fn in files:
+        if not fn.startswith('_') and files[fn] == '':
+            if fn.startswith(dir):
+                newmap[fn] = fn[len(dir):]
+            else:
+                newmap[fn] = fn
+        else:
+            newmap[fn] = files[fn]
+
+
+# new directories to create
+dirs_to_make = [
+    'c-api',
+    'data',
+    'distutils',
+    'documenting',
+    'extending',
+    'includes',
+    'includes/sqlite3',
+    'install',
+    'macmodules',
+    'modules',
+    'reference',
+    'tutorial',
+    'whatsnew',
+]
+
+# includefiles for \verbatiminput and \input
+includes_mapping = {
+    '../../Parser/Python.asdl': None,     # XXX
+    '../../Lib/test/exception_hierarchy.txt': None,
+    'emailmessage': 'email.message.rst',
+    'emailparser': 'email.parser.rst',
+    'emailgenerator': 'email.generator.rst',
+    'emailmimebase': 'email.mime.rst',
+    'emailheaders': 'email.header.rst',
+    'emailcharsets': 'email.charset.rst',
+    'emailencoders': 'email.encoders.rst',
+    'emailexc': 'email.errors.rst',
+    'emailutil': 'email.util.rst',
+    'emailiter': 'email.iterators.rst',
+}
+
+# new files to copy from converter/newfiles
+newfiles_mapping = {
+    'conf.py': 'conf.py',
+    'TODO': 'TODO',
+
+    'ref_index.rst': 'reference/index.rst',
+    'tutorial_index.rst': 'tutorial/index.rst',
+    'modules_index.rst': 'modules/index.rst',
+    'mac_index.rst': 'macmodules/index.rst',
+    'ext_index.rst': 'extending/index.rst',
+    'api_index.rst': 'c-api/index.rst',
+    'dist_index.rst': 'distutils/index.rst',
+    'contents.rst': 'contents.rst',
+    'about.rst': 'about.rst',
+
+    'doc.rst': 'documenting/index.rst',
+    'doc_intro.rst': 'documenting/intro.rst',
+    'doc_style.rst': 'documenting/style.rst',
+    'doc_sphinx.rst': 'documenting/sphinx.rst',
+    'doc_rest.rst': 'documenting/rest.rst',
+    'doc_markup.rst': 'documenting/markup.rst',
+}
+
+# copy files from the old doc tree
+copyfiles_mapping = {
+    'api/refcounts.dat': 'data',
+    'lib/email-*.py': 'includes',
+    'lib/minidom-example.py': 'includes',
+    'lib/tzinfo-examples.py': 'includes',
+    'lib/sqlite3/*.py': 'includes/sqlite3',
+    'ext/*.c': 'includes',
+    'ext/*.py': 'includes',
+    'commontex/typestruct.h': 'includes',
+}
+
+# files to rename
+rename_mapping = {
+    'tutorial/1_tutorial.rst': None, # delete
+    'tutorial/2_tutorial.rst': 'tutorial/appetite.rst',
+    'tutorial/3_tutorial.rst': 'tutorial/interpreter.rst',
+    'tutorial/4_tutorial.rst': 'tutorial/introduction.rst',
+    'tutorial/5_tutorial.rst': 'tutorial/controlflow.rst',
+    'tutorial/6_tutorial.rst': 'tutorial/datastructures.rst',
+    'tutorial/7_tutorial.rst': 'tutorial/modules.rst',
+    'tutorial/8_tutorial.rst': 'tutorial/inputoutput.rst',
+    'tutorial/9_tutorial.rst': 'tutorial/errors.rst',
+    'tutorial/10_tutorial.rst': 'tutorial/classes.rst',
+    'tutorial/11_tutorial.rst': 'tutorial/stdlib.rst',
+    'tutorial/12_tutorial.rst': 'tutorial/stdlib2.rst',
+    'tutorial/13_tutorial.rst': 'tutorial/whatnow.rst',
+    'tutorial/14_tutorial.rst': 'tutorial/interactive.rst',
+    'tutorial/15_tutorial.rst': 'tutorial/floatingpoint.rst',
+    'tutorial/16_tutorial.rst': None, # delete
+
+    'distutils/1_distutils.rst': 'distutils/introduction.rst',
+    'distutils/2_distutils.rst': 'distutils/setupscript.rst',
+    'distutils/3_distutils.rst': 'distutils/configfile.rst',
+    'distutils/4_distutils.rst': 'distutils/sourcedist.rst',
+    'distutils/5_distutils.rst': 'distutils/builtdist.rst',
+    'distutils/6_distutils.rst': 'distutils/packageindex.rst',
+    'distutils/7_distutils.rst': 'distutils/uploading.rst',
+    'distutils/8_distutils.rst': 'distutils/examples.rst',
+    'distutils/9_distutils.rst': 'distutils/extending.rst',
+    'distutils/10_distutils.rst': 'distutils/commandref.rst',
+    'distutils/11_distutils.rst': 'distutils/apiref.rst',
+}
+
+# toctree entries
+toctree_mapping = {
+    'mac/scripting': ['gensuitemodule', 'aetools', 'aepack', 'aetypes', 'miniae'],
+    'mac/toolbox': ['colorpicker'],
+    'lib/libstrings': ['string', 're', 'struct', 'difflib', 'stringio', 'textwrap',
+                       'codecs', 'unicodedata', 'stringprep', 'fpformat'],
+    'lib/datatypes': ['datetime', 'calendar', 'collections', 'heapq', 'bisect',
+                      'array', 'sets', 'sched', 'mutex', 'queue', 'weakref',
+                      'userdict', 'types', 'new', 'copy', 'pprint', 'repr'],
+    'lib/numeric': ['math', 'cmath', 'decimal', 'random', 'itertools', 'functools',
+                    'operator'],
+    'lib/netdata': ['email', 'mailcap', 'mailbox', 'mhlib', 'mimetools', 'mimetypes',
+                    'mimewriter', 'mimify', 'multifile', 'rfc822',
+                    'base64', 'binhex', 'binascii', 'quopri', 'uu'],
+    'lib/markup': ['htmlparser', 'sgmllib', 'htmllib', 'pyexpat', 'xml.dom',
+                   'xml.dom.minidom', 'xml.dom.pulldom', 'xml.sax', 'xml.sax.handler',
+                   'xml.sax.utils', 'xml.sax.reader', 'xml.etree.elementtree'],
+    'lib/fileformats': ['csv', 'configparser', 'robotparser', 'netrc', 'xdrlib'],
+    'lib/libcrypto': ['hashlib', 'hmac', 'md5', 'sha'],
+    'lib/filesys': ['os.path', 'fileinput', 'stat', 'statvfs', 'filecmp',
+                    'tempfile', 'glob', 'fnmatch', 'linecache', 'shutil', 'dircache'],
+    'lib/archiving': ['zlib', 'gzip', 'bz2', 'zipfile', 'tarfile'],
+    'lib/persistence': ['pickle', 'copy_reg', 'shelve', 'marshal', 'anydbm',
+                        'whichdb', 'dbm', 'gdbm', 'dbhash', 'bsddb', 'dumbdbm',
+                        'sqlite3'],
+    'lib/liballos': ['os', 'time', 'optparse', 'getopt', 'logging', 'getpass',
+                     'curses', 'curses.ascii', 'curses.panel', 'platform',
+                     'errno', 'ctypes'],
+    'lib/libsomeos': ['select', 'thread', 'threading', 'dummy_thread', 'dummy_threading',
+                      'mmap', 'readline', 'rlcompleter'],
+    'lib/libunix': ['posix', 'pwd', 'spwd', 'grp', 'crypt', 'dl', 'termios', 'tty',
+                    'pty', 'fcntl', 'pipes', 'posixfile', 'resource', 'nis',
+                    'syslog', 'commands'],
+    'lib/ipc': ['subprocess', 'socket', 'signal', 'popen2', 'asyncore', 'asynchat'],
+    'lib/internet': ['webbrowser', 'cgi', 'cgitb', 'wsgiref', 'urllib', 'urllib2',
+                     'httplib', 'ftplib', 'poplib', 'imaplib',
+                     'nntplib', 'smtplib', 'smtpd', 'telnetlib', 'uuid', 'urlparse',
+                     'socketserver', 'basehttpserver', 'simplehttpserver',
+                     'cgihttpserver', 'cookielib', 'cookie', 'xmlrpclib',
+                     'simplexmlrpcserver', 'docxmlrpcserver'],
+    'lib/libmm': ['audioop', 'imageop', 'aifc', 'sunau', 'wave', 'chunk',
+                  'colorsys', 'imghdr', 'sndhdr', 'ossaudiodev'],
+    'lib/i18n': ['gettext', 'locale'],
+    'lib/frameworks': ['cmd', 'shlex'],
+    'lib/development': ['pydoc', 'doctest', 'unittest', 'test'],
+    'lib/libpython': ['sys', '__builtin__', '__main__', 'warnings', 'contextlib',
+                      'atexit', 'traceback', '__future__', 'gc', 'inspect',
+                      'site', 'user', 'fpectl'],
+    'lib/custominterp': ['code', 'codeop'],
+    'lib/librestricted': ['rexec', 'bastion'],
+    'lib/modules': ['imp', 'zipimport', 'pkgutil', 'modulefinder', 'runpy'],
+    'lib/language': ['parser', 'symbol', 'token', 'keyword', 'tokenize',
+                     'tabnanny', 'pyclbr', 'py_compile', 'compileall', 'dis',
+                     'pickletools', 'distutils'],
+    'lib/compiler': ['ast'],
+    'lib/libmisc': ['formatter'],
+    'lib/libsgi': ['al', 'cd', 'fl', 'fm', 'gl', 'imgfile', 'jpeg'],
+    'lib/libsun': ['sunaudio'],
+    'lib/windows': ['msilib', 'msvcrt', '_winreg', 'winsound'],
+}
+
+# map sourcefilename to [pre, post]
+amendments_mapping = {
+    'license.rst': ['''\
+.. highlightlang:: none
+
+*******************
+History and License
+*******************
+
+''', ''],
+
+    'bugs.rst': ['''\
+**************
+Reporting Bugs
+**************
+
+''', ''],
+
+    'copyright.rst': ['''\
+*********
+Copyright
+*********
+
+''', ''],
+
+    'install/index.rst': ['''\
+.. _install-index:
+
+''', ''],
+}
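
The `fn_mapping` loop earlier in this file can be restated as a small standalone function. This is a Python 3 sketch rather than the converter's own code: `normalize_mapping`, the `sample` dict and the `'tut'` entry are hypothetical names chosen for illustration. The rules are read off the loop itself: keys starting with `_` and entries with a non-empty value (or `None`) are copied unchanged, while an empty-string value means "reuse the old file name, dropping the directory prefix if it has one".

```python
def normalize_mapping(mapping):
    """Expand '' placeholders the way the fn_mapping loop does (sketch)."""
    result = {}
    for dirname, files in mapping.items():
        newmap = result[dirname] = {}
        for fn, target in files.items():
            if not fn.startswith('_') and target == '':
                # '' means: reuse the old name, minus the directory prefix
                newmap[fn] = fn[len(dirname):] if fn.startswith(dirname) else fn
            else:
                # explicit names, None and '__special__' keys pass through
                newmap[fn] = target
    return result


sample = {
    'mac': {
        '__newname__': 'macmodules',
        'mac': None,
        'libmacos': 'macos',
        'scripting': '',
    },
    # hypothetical entry, only here to show prefix stripping
    'tut': {'tutintro': ''},
}

normalized = normalize_mapping(sample)
print(normalized['mac']['scripting'])
print(normalized['tut']['tutintro'])
```

Under these assumptions, `'scripting'` keeps its name because it does not start with `mac`, while the hypothetical `'tutintro'` loses its `tut` prefix.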

converter/latexparser.py

+# -*- coding: utf-8 -*-
+"""
+    Python documentation LaTeX file parser
+    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+    For more documentation, look into the ``restwriter.py`` file.
+
+    :copyright: 2007 by Georg Brandl.
+    :license: Python license.
+"""
+
+from .docnodes import CommentNode, RootNode, NodeList, ParaSepNode, \
+     TextNode, EmptyNode, NbspNode, SimpleCmdNode, BreakNode, CommandNode, \
+     DescLineCommandNode, InlineNode, IndexNode, SectioningNode, \
+     EnvironmentNode, DescEnvironmentNode, TableNode, VerbatimNode, \
+     ListNode, ItemizeNode, EnumerateNode, DescriptionNode, \
+     DefinitionsNode, ProductionListNode
+
+from .util import umlaut, empty
+
+
+class ParserError(Exception):
+    def __init__(self, msg, lineno):
+        Exception.__init__(self, msg, lineno)
+
+    def __str__(self):
+        return '%s, line %s' % self.args
+
+
+def generic_command(name, argspec, nodetype=CommandNode):
+    def handle(self):
+        args = self.parse_args('\\'+name, argspec)
+        return nodetype(name, args)
+    return handle
+
+def sectioning_command(name):
+    """ Special handling for sectioning commands: move labels directly following
+        a sectioning command before it, as required by reST. """
+    def handle(self):
+        args = self.parse_args('\\'+name, 'M')
+        snode = SectioningNode(name, args)
+        for l, t, v, r in self.tokens:
+            if t == 'command' and v == 'label':
+                largs = self.parse_args('\\label', 'T')
+                snode.args[0] = NodeList([snode.args[0], CommandNode('label', largs)])
+                break
+            if t == 'text':
+                if not v.strip():
+                    # discard whitespace; after a section that's no problem
+                    continue
+            self.tokens.push((l, t, v, r))
+            break
+        # no label followed
+        return snode
+    return handle
+
+def generic_environment(name, argspec, nodetype=EnvironmentNode):
+    def handle(self):
+        args = self.parse_args(name, argspec)
+        return nodetype(name, args, self.parse_until(self.environment_end))
+    return handle
+
+
+class DocParserMeta(type):
+    def __init__(cls, name, bases, dict):
+        for nodetype, commands in cls.generic_commands.iteritems():
+            for cmdname, argspec in commands.iteritems():
+                setattr(cls, 'handle_' + cmdname,
+                        generic_command(cmdname, argspec, nodetype))
+
+        for cmdname in cls.sectioning_commands:
+            setattr(cls, 'handle_' + cmdname, sectioning_command(cmdname))
+
+        for nodetype, envs in cls.generic_envs.iteritems():
+            for envname, argspec in envs.iteritems():
+                setattr(cls, 'handle_%s_env' % envname,
+                        generic_environment(envname, argspec, nodetype))
+
+
+class DocParser(object):
+    """ Parse a Python documentation LaTeX file. """
+    __metaclass__ = DocParserMeta
+
+    def __init__(self, tokenstream, filename):
+        self.tokens = tokenstream
+        self.filename = filename
+
+    def parse(self):
+        self.rootnode = RootNode(self.filename, None)
+        self.rootnode.children = self.parse_until(None)
+        self.rootnode.transform()
+        return self.rootnode
+
+    def parse_until(self, condition=None, endatbrace=False):
+        nodelist = NodeList()
+        bracelevel = 0
+        for l, t, v, r in self.tokens:
+            if condition and condition(t, v, bracelevel):
+                return nodelist.flatten()
+            if t == 'command':
+                if len(v) == 1 and not v.isalpha():
+                    nodelist.append(self.handle_special_command(v))
+                    continue
+                handler = getattr(self, 'handle_' + v, None)
+                if not handler:
+                    raise ParserError('no handler for \\%s command' % v, l)
+                nodelist.append(handler())
+            elif t == 'bgroup':
+                bracelevel += 1
+            elif t == 'egroup':
+                if bracelevel == 0 and endatbrace:
+                    return nodelist.flatten()
+                bracelevel -= 1
+            elif t == 'comment':
+                nodelist.append(CommentNode(v))
+            elif t == 'tilde':
+                nodelist.append(NbspNode())
+            elif t == 'mathmode':
+                pass # ignore math mode
+            elif t == 'parasep':
+                nodelist.append(ParaSepNode())
+            else:
+                # includes 'boptional' and 'eoptional' which don't have a
+                # special meaning in text
+                nodelist.append(TextNode(v))
+        return nodelist.flatten()
+
+    def parse_args(self, cmdname, argspec):
+        """ Helper to parse arguments of a command. """
+        # argspec: M = mandatory, T = mandatory, check text-only,
+        #          O = optional, Q = optional, check text-only
+        args = []
+        def optional_end(type, value, bracelevel):
+            return type == 'eoptional' and bracelevel == 0
+
+        for i, c in enumerate(argspec):
+            assert c in 'OMTQ'
+            nextl, nextt, nextv, nextr = self.tokens.pop()
+            while nextt == 'comment' or (nextt == 'text' and nextv.isspace()):
+                nextl, nextt, nextv, nextr = self.tokens.pop()
+
+            if c in 'OQ':
+                if nextt == 'boptional':
+                    arg = self.parse_until(optional_end)
+                    if c == 'Q' and not isinstance(arg, TextNode):
+                        raise ParserError('%s: argument %d must be text only' %
+                                          (cmdname, i), nextl)
+                    args.append(arg)
+                else:
+                    # not given
+                    args.append(EmptyNode())
+                    self.tokens.push((nextl, nextt, nextv, nextr))
+                continue
+
+            if nextt == 'bgroup':
+                arg = self.parse_until(None, endatbrace=True)
+                if c == 'T' and not isinstance(arg, TextNode):
+                    raise ParserError('%s: argument %d must be text only' %
+                                      (cmdname, i), nextl)
+                args.append(arg)
+            else:
+                if nextt != 'text':
+                    raise ParserError('%s: non-grouped non-text arguments not '
+                                      'supported' % cmdname, nextl)
+                args.append(TextNode(nextv[0]))
+                self.tokens.push((nextl, nextt, nextv[1:], nextr[1:]))
+        return args
+
+    sectioning_commands = [
+        'chapter',
+        'chapter*',
+        'section',
+        'subsection',
+        'subsubsection',
+        'paragraph',
+    ]
+
+    generic_commands = {
+        CommandNode: {
+            'label': 'T',
+
+            'localmoduletable': '',
+            'verbatiminput': 'T',
+            'input': 'T',
+            'centerline': 'M',
+
+            # Pydoc specific commands
+            'versionadded': 'OT',
+            'versionchanged': 'OT',
+            'deprecated': 'TM',
+            'XX' 'X': 'M',  # used in dist.tex ;)
+
+            # module-specific
+            'declaremodule': 'QTT',
+            'platform': 'T',
+            'modulesynopsis': 'M',
+            'moduleauthor': 'TT',
+            'sectionauthor': 'TT',
+
+            # reference lists
+            'seelink': 'TMM',
+            'seemodule': 'QTM',
+            'seepep': 'TMM',
+            'seerfc': 'TTM',
+            'seetext': 'M',
+            'seetitle': 'OMM',
+            'seeurl': 'MM',
+        },
+
+        DescLineCommandNode: {
+            # additional items for ...desc
+            'funcline': 'TM',
+            'funclineni': 'TM',
+            'methodline': 'QTM',
+            'methodlineni': 'QTM',
+            'memberline': 'QT',
+            'memberlineni': 'QT',
+            'dataline': 'T',
+            'datalineni': 'T',
+            'cfuncline': 'MTM',
+            'cmemberline': 'TTT',
+            'csimplemacroline': 'T',
+            'ctypeline': 'QT',
+            'cvarline': 'TT',
+        },
+
+        InlineNode: {
+            # specials
+            'footnote': 'M',
+            'frac': 'TT',
+            'refmodule': 'QT',
+            'citetitle': 'QT',
+            'ulink': 'MT',
+            'url': 'M',
+
+            # mapped to normal
+            'textrm': 'M',
+            'b': 'M',
+            'email': 'M', # email addresses are recognized by reST
+
+            # mapped to **strong**
+            'textbf': 'M',
+            'strong': 'M',
+
+            # mapped to *emphasized*
+            'textit': 'M',
+            'emph': 'M',
+
+            # mapped to ``code``
+            'bfcode': 'M',
+            'code': 'M',
+            'samp': 'M',
+            'character': 'M',
+            'texttt': 'M',
+
+            # mapped to `default role`
+            'var': 'M',
+
+            # mapped to [brackets]
+            'optional': 'M',
+
+            # mapped to :role:`text`
+            'cdata': 'M',
+            'cfunction': 'M',      # -> :cfunc:
+            'class': 'M',
+            'command': 'M',
+            'constant': 'M',       # -> :const:
+            'csimplemacro': 'M',   # -> :cmacro:
+            'ctype': 'M',
+            'data': 'M',           # NEW
+            'dfn': 'M',
+            'envvar': 'M',
+            'exception': 'M',      # -> :exc:
+            'file': 'M',
+            'filenq': 'M',
+            'filevar': 'M',
+            'function': 'M',       # -> :func:
+            'grammartoken': 'M',   # -> :token:
+            'guilabel': 'M',
+            'kbd': 'M',
+            'keyword': 'M',
+            'mailheader': 'M',
+            'makevar': 'M',
+            'manpage': 'MM',
+            'member': 'M',
+            'menuselection': 'M',
+            'method': 'M',         # -> :meth:
+            'mimetype': 'M',
+            'module': 'M',         # -> :mod:
+            'newsgroup': 'M',
+            'option': 'M',
+            'pep': 'M',
+            'program': 'M',
+            'programopt': 'M',     # -> :option:
+            'longprogramopt': 'M', # -> :option:
+            'ref': 'T',
+            'regexp': 'M',
+            'rfc': 'M',
+            'token': 'M',
+
+            'NULL': '',
+            # these are defined via substitutions
+            'shortversion': '',
+            'version': '',
+            'today': '',
+        },
+
+        SimpleCmdNode: {
+            # these are directly mapped to text
+            'AA': '', # A as in Angstrom
+            'ASCII': '',
+            'C': '',
+            'Cpp': '',
+            'EOF': '',
+            'LaTeX': '',
+            'POSIX': '',
+            'UNIX': '',
+            'Unix': '',
+            'backslash': '',
+            'copyright': '',
+            'e': '', # backslash
+            'geq': '',
+            'infinity': '',
+            'ldots': '',
+            'leq': '',
+            'moreargs': '',
+            'pi': '',
+            'plusminus': '',
+            'sub': '', # menu separator
+            'textbackslash': '',
+            'textunderscore': '',
+            'texteuro': '',
+            'textasciicircum': '',
+            'textasciitilde': '',
+            'textgreater': '',
+            'textless': '',
+            'textbar': '',
+            'tilde': '',
+            'unspecified': '',
+        },
+
+        IndexNode: {
+            'bifuncindex': 'T',
+            'exindex': 'T',
+            'kwindex': 'T',
+            'obindex': 'T',
+            'opindex': 'T',
+            'refmodindex': 'T',
+            'refexmodindex': 'T',
+            'refbimodindex': 'T',
+            'refstmodindex': 'T',
+            'stindex': 'T',
+            'index': 'M',
+            'indexii': 'TT',
+            'indexiii': 'TTT',
+            'indexiv': 'TTTT',
+            'ttindex': 'T',
+            'withsubitem': 'TM',
+        },
+
+        # These can be safely ignored
+        EmptyNode: {
+            'setindexsubitem': 'T',
+            'tableofcontents': '',
+            'makeindex': '',
+            'makemodindex': '',
+            'maketitle': '',
+            'appendix': '',
+            'documentclass': 'OM',
+            'usepackage': 'OM',
+            'noindent': '',
+            'protect': '',
+            'ifhtml': '',
+            'fi': '',
+        },
+    }
+
+    generic_envs = {
+        EnvironmentNode: {
+            # generic LaTeX environments
+            'abstract': '',
+            'quote': '',
+            'quotation': '',
+
+            'notice': 'Q',
+            'seealso': '',
+            'seealso*': '',
+        },
+
+        DescEnvironmentNode: {
+            # information units
+            'datadesc': 'T',
+            'datadescni': 'T',
+            'excclassdesc': 'TM',
+            'excdesc': 'T',
+            'funcdesc': 'TM',
+            'funcdescni': 'TM',
+            'classdesc': 'TM',
+            'classdesc*': 'T',
+            'memberdesc': 'QT',
+            'memberdescni': 'QT',
+            'methoddesc': 'QMM',
+            'methoddescni': 'QMM',
+            'opcodedesc': 'TT',
+
+            'cfuncdesc': 'MTM',
+            'cmemberdesc': 'TTT',
+            'csimplemacrodesc': 'T',
+            'ctypedesc': 'QT',
+            'cvardesc': 'TT',
+        },
+    }
+
+    # ------------------------- special handlers -----------------------------
+
+    def handle_special_command(self, cmdname):
+        if cmdname in '{}%$^#&_ ':
+            # these are just escapes for special LaTeX characters
+            return TextNode(cmdname)
+        elif cmdname in '\'`~"c':
+            # accents and umlauts
+            nextl, nextt, nextv, nextr = self.tokens.next()
+            if nextt == 'bgroup':
+                _, nextt, _, _ = self.tokens.next()
+                if nextt != 'egroup':
+                    raise ParserError('wrong argtype for \\%s' % cmdname, nextl)
+                return TextNode(cmdname)
+            if nextt != 'text':
+                # not nice, but {\~} = ~
+                self.tokens.push((nextl, nextt, nextv, nextr))
+                return TextNode(cmdname)
+            c = umlaut(cmdname, nextv[0])
+            self.tokens.push((nextl, nextt, nextv[1:], nextr[1:]))
+            return TextNode(c)
+        elif cmdname == '\\':
+            return BreakNode()
+        raise ParserError('no handler for \\%s command' % cmdname,
+                          self.tokens.peek()[0])
+
+    def handle_begin(self):
+        envname, = self.parse_args('\\begin', 'T')
+        handler = getattr(self, 'handle_%s_env' % envname.text, None)
+        if not handler:
+            raise ParserError('no handler for %s environment' % envname.text,
+                              self.tokens.peek()[0])
+        return handler()
+
+    # ------------------------- command handlers -----------------------------
+
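+    # Factory used at class-definition time below: it is called with ``None``
+    # for ``self``, and the closure it returns becomes the handler method.
+    def mk_metadata_handler(self, name, mdname=None):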
+        if mdname is None:
+            mdname = name
+        def handler(self):
+            data, = self.parse_args('\\'+name, 'M')
+            self.rootnode.params[mdname] = data
+            return EmptyNode()
+        return handler
+
+    handle_title = mk_metadata_handler(None, 'title')
+    handle_author = mk_metadata_handler(None, 'author')
+    handle_authoraddress = mk_metadata_handler(None, 'authoraddress')
+    handle_date = mk_metadata_handler(None, 'date')
+    handle_release = mk_metadata_handler(None, 'release')
+    handle_setshortversion = mk_metadata_handler(None, 'setshortversion',
+                                                 'shortversion')
+    handle_setreleaseinfo = mk_metadata_handler(None, 'setreleaseinfo',
+                                                'releaseinfo')
+
+    def handle_note(self):
+        note = self.parse_args('\\note', 'M')[0]
+        return EnvironmentNode('notice', [TextNode('note')], note)
+
+    def handle_warning(self):
+        warning = self.parse_args('\\warning', 'M')[0]
+        return EnvironmentNode('notice', [TextNode('warning')], warning)
+
+    def handle_ifx(self):
+        for l, t, v, r in self.tokens:
+            if t == 'command' and v == 'fi':
+                break
+        return EmptyNode()
+
+    def handle_c(self):
+        return self.handle_special_command('c')
+
+    def handle_mbox(self):
+        return self.parse_args('\\mbox', 'M')[0]
+
+    def handle_leftline(self):
+        return self.parse_args('\\leftline', 'M')[0]
+
+    def handle_Large(self):
+        return self.parse_args('\\Large', 'M')[0]
+
+    def handle_pytype(self):
+        # \pytype{x} is synonymous with \class{x} now
+        return self.handle_class()
+
+    def handle_nodename(self):
+        return self.handle_label()
+
+    def handle_verb(self):
+        # skip delimiter
+        l, t, v, r = self.tokens.next()
+        l, t, v, r = self.tokens.next()
+        assert t == 'text'
+        node = InlineNode('code', [TextNode(r)])
+        # skip delimiter
+        l, t, v, r = self.tokens.next()
+        return node
+
+    def handle_locallinewidth(self):
+        return EmptyNode()
+
+    def handle_linewidth(self):
+        return EmptyNode()
+
+    def handle_setlength(self):
+        self.parse_args('\\setlength', 'MM')
+        return EmptyNode()
+
+    def handle_stmodindex(self):
+        arg, = self.parse_args('\\stmodindex', 'T')
+        return CommandNode('declaremodule', [EmptyNode(),
+                                             TextNode(u'standard'),
+                                             arg])
+
+    def handle_indexname(self):
+        return EmptyNode()
+
+    def handle_renewcommand(self):
+        self.parse_args('\\renewcommand', 'MM')
+        return EmptyNode()
+
+    # ------------------------- environment handlers -------------------------
+
+    def handle_document_env(self):
+        return self.parse_until(self.environment_end)
+
+    handle_sloppypar_env = handle_document_env
+    handle_flushleft_env = handle_document_env
+    handle_math_env = handle_document_env
+
+    def handle_verbatim_env(self):
+        text = []
+        for l, t, v, r in self.tokens:
+            if t == 'command' and v == 'end':
+                tok = self.tokens.peekmany(3)
+                if tok[0][1] == 'bgroup' and \
+                   tok[1][1] == 'text' and \
+                   tok[1][2] == 'verbatim' and \
+                   tok[2][1] == 'egroup':
+                    self.tokens.popmany(3)
+                    break
+            text.append(r)
+        return VerbatimNode(TextNode(''.join(text)))
+
+    # involved math markup must be corrected manually
+    def handle_displaymath_env(self):
+        text = ['XXX: translate this math']
+        for l, t, v, r in self.tokens:
+            if t == 'command' and v == 'end':
+                tok = self.tokens.peekmany(3)
+                if tok[0][1] == 'bgroup' and \
+                   tok[1][1] == 'text' and \
+                   tok[1][2] == 'displaymath' and \
+                   tok[2][1] == 'egroup':
+                    self.tokens.popmany(3)
+                    break
+            text.append(r)
+        return VerbatimNode(TextNode(''.join(text)))
+
+    # alltt is different from verbatim because it allows markup
+    def handle_alltt_env(self):
+        nodelist = NodeList()
+        for l, t, v, r in self.tokens:
+            if self.environment_end(t, v):
+                break
+            if t == 'command':
+                if len(v) == 1 and not v.isalpha():
+                    nodelist.append(self.handle_special_command(v))
+                    continue
+                handler = getattr(self, 'handle_' + v, None)
+                if not handler:
+                    raise ParserError('no handler for \\%s command' % v, l)
+                nodelist.append(handler())
+            elif t == 'comment':
+                nodelist.append(CommentNode(v))
+            else:
+                # all else is appended raw
+                nodelist.append(TextNode(r))
+        return VerbatimNode(nodelist.flatten())
+
+    def handle_itemize_env(self, nodetype=ItemizeNode):
+        items = []
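+        # a use case for nonlocal :)  The list is truthy while non-empty;
+        # item_condition() empties it at the environment's \end to stop the loop.
+        running = [False]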
+
+        def item_condition(t, v, bracelevel):
+            if self.environment_end(t, v):
+                del running[:]
+                return True
+            if t == 'command' and v == 'item':
+                return True
+            return False
+
+        # the text until the first \item is discarded
+        self.parse_until(item_condition)
+        while running:
+            itemname, = self.parse_args('\\item', 'O')
+            itemcontent = self.parse_until(item_condition)
+            items.append([itemname, itemcontent])
+        return nodetype(items)
+
+    def handle_enumerate_env(self):
+        return self.handle_itemize_env(EnumerateNode)
+
+    def handle_description_env(self):
+        return self.handle_itemize_env(DescriptionNode)
+
+    def handle_definitions_env(self):
+        items = []
+        running = [False]
+
+        def item_condition(t, v, bracelevel):
+            if self.environment_end(t, v):
+                del running[:]
+                return True
+            if t == 'command' and v == 'term':
+                return True
+            return False
+
+        # the text until the first \term is discarded
+        self.parse_until(item_condition)
+        while running:
+            itemname, = self.parse_args('\\term', 'M')
+            itemcontent = self.parse_until(item_condition)
+            items.append([itemname, itemcontent])
+        return DefinitionsNode(items)
+
+    # Factory like mk_metadata_handler: called below with ``None`` for ``self``.
+    def mk_table_handler(self, envname, numcols):
+        def handle_table(self):
+            args = self.parse_args('table'+envname, 'TT' + 'M'*numcols)
+            firstcolformat = args[1].text
+            headings = args[2:]
+            lines = []
+            for l, t, v, r in self.tokens:
+                # XXX: everything outside of \linexxx is lost here
+                if t == 'command':
+                    if v == 'line'+envname:
+                        lines.append(self.parse_args('\\line'+envname,
+                                                     'M'*numcols))
+                    elif v == 'end':
+                        arg = self.parse_args('\\end', 'T')
+                        assert arg[0].text.endswith('table'+envname), arg[0].text
+                        break
+            for line in lines:
+                if not empty(line[0]):
+                    line[0] = InlineNode(firstcolformat, [line[0]])
+            return TableNode(numcols, headings, lines)
+        return handle_table
+
+    handle_tableii_env = mk_table_handler(None, 'ii', 2)
+    handle_longtableii_env = handle_tableii_env
+    handle_tableiii_env = mk_table_handler(None, 'iii', 3)
+    handle_longtableiii_env = handle_tableiii_env
+    handle_tableiv_env = mk_table_handler(None, 'iv', 4)
+    handle_longtableiv_env = handle_tableiv_env
+    handle_tablev_env = mk_table_handler(None, 'v', 5)
+    handle_longtablev_env = handle_tablev_env
+
+    def handle_productionlist_env(self):
+        env_args = self.parse_args('productionlist', 'Q')
+        items = []
+        for l, t, v, r in self.tokens:
+            # XXX: everything outside of \production is lost here
+            if t == 'command':
+                if v == 'production':
+                    items.append(self.parse_args('\\production', 'TM'))
+                elif v == 'productioncont':
+                    args = self.parse_args('\\productioncont', 'M')
+                    args.insert(0, EmptyNode())
+                    items.append(args)
+                elif v == 'end':
+                    arg = self.parse_args('\\end', 'T')
+                    assert arg[0].text == 'productionlist'
+                    break
+        node = ProductionListNode(items)
+        # the argument specifies a production group
+        node.arg = env_args[0]
+        return node
+
+    def environment_end(self, t, v, bracelevel=0):
+        if t == 'command' and v == 'end':
+            self.parse_args('\\end', 'T')
+            return True
+        return False

converter/newfiles/TODO

+To do after conversion
+======================
+
+* fix all references and links marked with `XXX`
+* adjust all literal include paths
+* remove all non-literal includes
+* fix all duplicate labels and undefined label references
+* fix the email package docs: add a toctree
+* split very large files and add toctrees
+* integrate standalone HOWTOs
+* find out which files get "comments disabled" metadata
+* double backslashes in production lists
+* add synopses for each module
+* write "About these documents"
+* finish "Documenting Python"
+* extend copyright.rst
+* merge ACKS into about.rst
+* fix the "quadruple" index term

converter/newfiles/about.rst

+=====================
+About these documents
+=====================
+
+These documents are generated from `reStructuredText
+<http://docutils.sf.net/rst.html>`_ sources by *Sphinx*, a document processor
+specifically written for the Python documentation.
+
+In the online version of these documents, you can submit comments and suggest
+changes directly on the documentation pages.
+
+Development of the documentation and its toolchain takes place on the
+docs@python.org mailing list.  We're always looking for volunteers who want
+to help with the docs, so feel free to send an email there!
+
+See :ref:`reporting-bugs` for information on how to report bugs in Python itself.

converter/newfiles/api_index.rst

+.. _c-api-index:
+
+##################################
+  Python/C API Reference Manual
+##################################
+
+:Release: |version|
+:Date: |today|
+
+This manual documents the API used by C and C++ programmers who want to write
+extension modules or embed Python.  It is a companion to :ref:`extending-index`,
+which describes the general principles of extension writing but does not
+document the API functions in detail.
+
+.. warning::
+
+   The current version of this document is somewhat incomplete. However, most of
+   the important functions, types and structures are described.
+
+
+.. toctree::
+   :maxdepth: 2
+
+   intro.rst
+   veryhigh.rst
+   refcounting.rst
+   exceptions.rst
+   utilities.rst
+   abstract.rst
+   concrete.rst
+   init.rst
+   memory.rst
+   newtypes.rst