+Title: reStructuredText Standard Docstring Format
+Author: email@example.com (David Goodger)
+ This PEP proposes that the reStructuredText [1]_ markup be adopted
+ as the standard markup format for plaintext documentation in
+ Python docstrings, and (optionally) for PEPs and ancillary
+ documents as well. reStructuredText is a rich and extensible yet
+ easy-to-read, what-you-see-is-what-you-get plaintext markup
+ Only the low-level syntax of docstrings is addressed here. This
+ PEP is not concerned with docstring semantics or processing at all.
+ These are the generally accepted goals for a docstring format, as
+ discussed in the Python Documentation Special Interest Group
+ 1. It must be easy to type with any standard text editor.
+ 2. It must be readable to the casual observer.
+ 3. It must not need to contain information which can be deduced
+ from parsing the module.
+ 4. It must contain sufficient information (structure) so it can be
+ converted to any reasonable markup format.
+ 5. It must be possible to write a module's entire documentation in
+ docstrings, without feeling hampered by the markup language.
+ [[Are these in fact the goals of the Doc-SIG members? Anything to add?]]
+ reStructuredText meets and exceeds all of these goals, and sets
+ its own goals as well, even more stringent. See "Features" below.
+ The goals of this PEP are as follows:
+ 1. To establish a standard docstring format by attaining
+ "accepted" status (Python community consensus; BDFL
+ pronouncement). Once reStructuredText is a Python standard,
+ all effort can be focused on tools instead of arguing for a
+ standard. Python needs a standard set of documentation tools.
+ 2. To address any related concerns raised by the Python community.
+ 3. To encourage community support. As long as multiple competing
+ markups are out there, the development community remains
+ fractured. Once a standard exists, people will start to use
+ it, and momentum will inevitably gather.
+ 4. To consolidate efforts from related auto-documentation
+ projects. It is hoped that interested developers will join
+ forces and work on a joint/merged/common implementation.
+ 5. (Optional.) To adopt reStructuredText as the standard markup
+ for PEPs. One or both of the following strategies may be
+ a) Keep the existing PEP section structure constructs (one-line
+ section headers, indented body text). Subsections can
+ either be forbidden or supported with underlined headers in
+ the indented body text.
+ b) Replace the PEP section structure constructs with the
+ reStructuredText syntax. Section headers will require
+ underlines, subsections will be supported out of the box,
+ and body text need not be indented (except for block quotes).
+ Support for RFC 2822 headers will be added to the
+ reStructuredText parser (unambiguous given a specific context:
+ the first contiguous block of a PEP document). It may be
+ desired to concretely specify what over/underline styles are
+ allowed for PEP section headers, for uniformity.
+ 6. (Optional.) To adopt reStructuredText as the standard markup
+ for README-type files and other standalone documents in the
+ The __doc__ attribute is called a documentation string, or
+ docstring. It is often used to summarize the interface of the
+ module, class or function. The lack of a standard syntax for
+ docstrings has hampered the development of standard tools for
+ extracting docstrings and transforming them into documentation in
+ standard formats (e.g., HTML, DocBook, TeX). There have been a
+ number of proposed markup formats and variations, and many tools
+ tied to these proposals, but without a standard docstring format
+ they have failed to gain a strong following and/or floundered
+ The adoption of a standard will, at the very least, benefit
+ docstring processing tools by preventing further "reinventing the wheel".
+ Throughout the existence of the Doc-SIG, consensus on a single
+ standard docstring format has never been reached. A lightweight,
+ implicit markup has been sought, for the following reasons (among others):
+ 1. Docstrings written within Python code are available from within
+ the interactive interpreter, and can be 'print'ed. Thus the
+ use of plaintext for easy readability.
+ 2. Programmers want to add structure to their docstrings, without
+ sacrificing raw docstring readability. Unadorned plaintext
+ cannot be transformed ('up-translated') into useful structured
+ 3. Explicit markup (like XML or TeX) is widely considered
+ unreadable by the uninitiated.
+ 4. Implicit markup is aesthetically compatible with the clean and
+ minimalist Python syntax.
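Reason 1 in the list above can be seen directly in the interpreter. The sketch below is only an illustration; the function `area` and its docstring are hypothetical and not taken from any of the proposals discussed here.

```python
import inspect


def area(width, height):
    """Return the area of a rectangle.

    Both `width` and `height` are expected to be numbers.
    """
    return width * height


# The docstring is stored on the object as its __doc__ attribute.
raw = area.__doc__

# inspect.getdoc() removes the common leading indentation, which is
# the form that help() and most documentation tools display.
cleaned = inspect.getdoc(area)
print(cleaned)
```

Because the text is shown as written, any markup in it has to stay readable without processing.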
+ Proposed alternatives have included:
+ - XML [3]_, SGML [4]_, DocBook [5]_, HTML [6]_, XHTML [7]_
+ XML and SGML are explicit, well-formed meta-languages suitable
+ for all kinds of documentation. XML is a variant of SGML. They
+ are best used behind the scenes, because they are verbose,
+ difficult to type, and too cluttered to read comfortably as
+ source. DocBook, HTML, and XHTML are all applications of SGML
+ and/or XML, and all share the same basic syntax and the same
+ - TeX [8]_
+ TeX is similar to XML/SGML in that it's explicit, not very easy
+ to write, and not easy for the uninitiated to read.
+ - Perl POD [9]_
+ Most Perl modules are documented in a format called POD -- Plain
+ Old Documentation. This is an easy-to-type, very low-level
+ format with strong integration with the Perl parser. Many tools
+ exist to turn POD documentation into other formats: info, HTML
+ and man pages, among others. However, the POD syntax takes
+ after Perl itself in terms of readability.
+ - JavaDoc [10]_
+ Special comments before Java classes and functions serve to
+ document the code. A program called javadoc, part of the
+ standard Java distribution, extracts these comments and turns
+ them into HTML documentation. However, the only output format
+ that is supported is HTML, and JavaDoc has a very intimate
+ relationship with HTML, using HTML tags for most markup. Thus
+ it shares the readability problems of HTML.
+ - Setext [11]_, StructuredText [12]_
+ Early on, variants of Setext (Structure Enhanced Text),
+ including Zope Corp's StructuredText, were proposed for Python
+ docstring formatting. Hereafter these variants will
+ collectively be called 'STexts'. STexts have the advantage of
+ being easy to read without special knowledge, and relatively
+ Although used by some (including in most existing Python
+ auto-documentation tools), until now STexts have failed to
+ become standard because:
+ - STexts have been incomplete. Lacking "essential" constructs
+ that people want to use in their docstrings, STexts are
+ rendered less than ideal. Note that these "essential"
+ constructs are not universal; everyone has their own
+ - STexts have been sometimes surprising. Bits of text are
+ marked up unexpectedly, leading to user frustration.
+ - SText implementations have been buggy.
+ - Most STexts have had no formal specification except for
+ the implementation itself. A buggy implementation meant a
+ buggy spec, and vice-versa.
+ - There has been no mechanism to get around the SText markup
+ rules when a markup character is used in a non-markup context.
+ Proponents of implicit STexts have vigorously opposed proposals
+ for explicit markup (XML, HTML, TeX, POD, etc.), and the debates
+ have continued off and on since 1996 or earlier.
+ reStructuredText is a complete revision and reinterpretation of
+ the SText idea, addressing all of the problems listed above.
+ Rather than repeating or summarizing the extensive
+ reStructuredText spec, please read the originals available from
+ http://structuredtext.sourceforge.net/spec/ (.txt & .html files).
+ Reading the documents in the following order is recommended:
+ - An Introduction to reStructuredText [13]_
+ - Problems With StructuredText [14]_ (optional, if you've used
+ StructuredText; it explains many markup decisions made)
+ - reStructuredText Markup Specification [15]_
+ - A Record of reStructuredText Syntax Alternatives [16]_ (explains
+ markup decisions made independently of StructuredText)
+ - reStructuredText Directives [17]_
+ There is also a "Quick reStructuredText" user reference [18]_.
+ A summary of features addressing often-raised docstring markup
+ - A markup escaping mechanism.
+ Backslashes (``\``) are used to escape markup characters when
+ needed for non-markup purposes. However, the inline markup
+ recognition rules have been constructed in order to minimize the
+ need for backslash-escapes. For example, although asterisks are
+ used for *emphasis*, in non-markup contexts such as "*" or "(*)"
+ or "x * y", the asterisks are not interpreted as markup and are
+ left unchanged. For many non-markup uses of backslashes (e.g.,
+ describing regular expressions), inline literals or literal
+ blocks are applicable; see the next item.
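One practical point when writing backslash-escapes inside a docstring: Python itself processes backslashes in ordinary string literals before any markup parser sees them, so a raw string (or a doubled backslash) is needed for the backslash to survive. A minimal sketch, with hypothetical function names:

```python
def plain():
    """An escaped asterisk: \\*not emphasis\\*."""


def raw():
    r"""An escaped asterisk: \*not emphasis\*."""


# Both docstrings hold exactly one backslash before each asterisk,
# which is what a markup parser would need to see.
print(raw.__doc__)
```

The raw form is usually easier to read in the source, which matters when the docstring is also meant to be read unprocessed.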
+ - Markup to include Python source code and Python interactive
+ sessions: inline literals, literal blocks, and doctest blocks.
+ Inline literals use ``double-backquotes`` to indicate program
+ I/O or code snippets. No markup interpretation (including
+ backslash-escape [``\``] interpretation) is done within inline
+ Literal blocks (block-level literal text, such as code excerpts
+ or ASCII graphics) are indented, and indicated with a
+ double-colon ("::") at the end of the preceding paragraph (right
+ spaces_and_linebreaks = 'are preserved'
+ markup_processing = None
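The literal block rule described above (a paragraph ending in "::" followed by indented text) can be approximated in a few lines. This is a rough sketch under simplifying assumptions, not the reStructuredText parser: the function name `literal_blocks` is hypothetical, and directives that end in "::", nested indentation, and quoted literal blocks are not handled.

```python
import textwrap


def literal_blocks(text):
    """Return the literal blocks in `text`, dedented, in document order."""
    blocks = []
    current = None  # lines of the block being collected, or None
    for line in text.splitlines():
        if current is not None:
            # Blank and indented lines belong to the open block.
            if not line.strip() or line[0] in " \t":
                current.append(line)
                continue
            # An unindented line closes the block.
            blocks.append(current)
            current = None
        if line.rstrip().endswith("::"):
            current = []
    if current is not None:
        blocks.append(current)
    return [
        textwrap.dedent("\n".join(block)).strip("\n")
        for block in blocks
        if any(line.strip() for line in block)
    ]


sample = """\
A paragraph introducing a literal block::

    spaces_and_linebreaks = 'are   preserved'
    markup_processing = None

Back to normal text.
"""

for block in literal_blocks(sample):
    print(block)
```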
+ Doctest blocks begin with ">>> " and end with a blank line.
+ Neither indentation nor literal block double-colons are
+ required. For example::
+ Here's a doctest block:
+ >>> print 'Python-specific usage examples; begun with ">>>"'
+ Python-specific usage examples; begun with ">>>"
+ >>> print '(cut and pasted from interactive sessions)'
+ (cut and pasted from interactive sessions)
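Doctest blocks in a docstring can be checked with the standard library's doctest module. The sketch below uses a hypothetical function, `double`, and runs the examples embedded in its docstring; it illustrates the idea only and is not part of this proposal.

```python
import doctest


def double(x):
    """Return twice `x`.

    >>> double(21)
    42
    >>> double('ab')
    'abab'
    """
    return 2 * x


# Parse the docstring into a DocTest and run its examples.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {"double": double}, "double", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.attempted, "examples,", results.failed, "failures")
```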
+ - Markup that isolates a Python identifier: interpreted text.
+ Text enclosed in single backquotes is recognized as "interpreted
+ text", whose interpretation is application-dependent. In the
+ context of a Python docstring, the default interpretation of
+ interpreted text is as Python identifiers. The text will be
+ marked up with a hyperlink connected to the documentation for
+ the identifier given. Lookup rules are the same as in Python
+ itself: LGB namespace lookups (local, global, builtin). The
+ "role" of the interpreted text (identifying a class, module,
+ function, etc.) is determined implicitly from the namespace
+ Extend `Storer`. Class attribute `instances` keeps track
+ of the number of `Keeper` objects instantiated.
+ """How many `Keeper` objects are there?"""
+ Extend `Storer.__init__()` to keep track of
+ instances. Keep count in `self.instances` and data
+ """Store data in a list, most recent last."""
+ def storedata(self, data):
+ Extend `Storer.storedata()`; append new `data` to a
+ Each piece of interpreted text is looked up according to the
+ local namespace of the block containing its docstring.
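The lookup order described above (local, then global, then builtin) can be sketched as follows. This is an illustration under stated assumptions rather than the proposed implementation: `resolve` is a hypothetical helper, the two classes are minimal stand-ins, and dotted names such as `Storer.__init__()` are not handled.

```python
import builtins


def resolve(name, local_ns, global_ns):
    """Return the object bound to `name`, searching local, global, builtin."""
    for namespace in (local_ns, global_ns, vars(builtins)):
        if name in namespace:
            return namespace[name]
    raise NameError(name)


class Storer:
    def storedata(self, data):
        self.data = data


class Keeper(Storer):
    instances = 0


# For a docstring in the body of Keeper, the class namespace is "local".
local_ns = vars(Keeper)
global_ns = {"Storer": Storer, "Keeper": Keeper}

print(resolve("instances", local_ns, global_ns))  # found in the local namespace
print(resolve("Storer", local_ns, global_ns))     # found in the global namespace
print(resolve("len", local_ns, global_ns))        # found among the builtins
```

A documentation tool could use the kind of object found (class, function, attribute) to infer the role implicitly.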
+ - Markup that isolates a Python identifier and specifies its type:
+ interpreted text with roles.
+ Although the Python source context reader is designed not to
+ require explicit roles, they may be used. To classify
+ identifiers explicitly, the role is given along with the
+ identifier in either prefix or suffix form::
+ Use :method:`Keeper.storedata` to store the object's data in
+ The syntax chosen for roles is verbose, but necessarily so (if
+ anyone has a better alternative, please post it to the Doc-SIG).
+ The intention of the markup is that there should be little need
+ to use explicit roles; their use is to be kept to an absolute minimum.
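To make the prefix and suffix forms concrete, here is a small sketch that recognizes both with regular expressions. It is an approximation, not the parser's actual recognition rules; `find_roles` is a hypothetical name, and the suffix-form sample string is written for this illustration.

```python
import re

# Prefix form:  :role:`text`
_PREFIX = re.compile(r":(?P<role>[\w-]+):`(?P<text>[^`]+)`")
# Suffix form:  `text`:role:
_SUFFIX = re.compile(r"`(?P<text>[^`]+)`:(?P<role>[\w-]+):")


def find_roles(source):
    """Return (role, text) pairs for explicit roles in either form."""
    found = []
    for pattern in (_PREFIX, _SUFFIX):
        for match in pattern.finditer(source):
            found.append((match.group("role"), match.group("text")))
    return found


print(find_roles("Use :method:`Keeper.storedata` to store the data."))
print(find_roles("Use `Keeper.storedata`:method: to store the data."))
```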
+ - Markup for "tagged lists" or "label lists": field lists.
+ Field lists represent a mapping from field name to field body.
+ These are mostly used for extension syntax, such as
+ "bibliographic field lists" (representing document metadata such
+ as author, date, and version) and extension attributes for
+ directives (see below). They may be used to implement docstring
+ semantics, such as identifying parameters, exceptions raised,
+ etc.; such usage is beyond the scope of this PEP.
+ A modified RFC 2822 syntax is used, with a colon *before* as
+ well as *after* the field name. Field bodies are more versatile
+ as well; they may contain multiple body elements (even nested
+ field lists). For example::
+ Standard RFC 2822 header syntax cannot be used for this
+ construct because it is ambiguous. A word followed by a colon
+ at the beginning of a line is common in written text. However,
+ with the addition of a well-defined context, such as when a
+ field list invariably occurs at the beginning of a document
+ (e.g., PEPs and email messages), standard RFC 2822 header syntax
+ can be used.
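The mapping from field name to field body can be sketched with a simple line-based reader. The function `parse_field_list` and the sample fields are hypothetical; real field bodies may hold arbitrary body elements, which this sketch flattens into a single string.

```python
import re

# A field starts with ":name:" at the beginning of a line.
_FIELD = re.compile(r"^:(?P<name>[^:]+):\s*(?P<body>.*)$")


def parse_field_list(text):
    """Return a dict mapping field names to flattened field bodies."""
    fields = {}
    current = None  # name of the field being continued, if any
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            current = match.group("name")
            fields[current] = match.group("body").strip()
        elif current is not None and line[:1] in (" ", "\t") and line.strip():
            # Indented lines continue the previous field body.
            fields[current] = (fields[current] + " " + line.strip()).strip()
        else:
            current = None
    return fields


sample = """\
:Author: Me
:Version: 1
:Description: A body that spans
    two lines.
"""

print(parse_field_list(sample))
```

The leading colon is what lets the pattern anchor on a field without mistaking an ordinary "Word: text" line for one.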
+ - Markup extensibility: directives and substitutions.
+ Directives are used as an extension mechanism for
+ reStructuredText, a way of adding support for new block-level
+ constructs without adding new syntax. Directives for images,
+ admonitions (note, caution, etc.), and tables of contents
+ generation (among others) have been implemented. For example,
+ here's how to place an image::
+ Substitution definitions allow the power and flexibility of
+ block-level directives to be shared by inline text. For
+ The |biohazard| symbol must be used on containers used to
+ dispose of medical waste.
+ .. |biohazard| image:: biohazard.png
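One way to picture the extension mechanism is a registry that maps directive names to handler functions, so a new construct needs a new handler but no new syntax. The sketch below is an assumption-laden illustration, not the Docutils API: `HANDLERS`, `parse_directive`, the handler functions, and the file name `picture.png` are all hypothetical.

```python
import re

# ".. name:: argument", optionally preceded by "|substitution name|".
_DIRECTIVE = re.compile(
    r"^\.\.\s+(?:\|(?P<substitution>[^|]+)\|\s+)?"
    r"(?P<name>[\w-]+)::\s*(?P<argument>.*)$"
)


def image(argument):
    return {"type": "image", "uri": argument}


def note(argument):
    return {"type": "note", "text": argument}


# Adding a directive means adding an entry here; the syntax is unchanged.
HANDLERS = {"image": image, "note": note}


def parse_directive(line):
    """Return (substitution name or None, node built by the handler)."""
    match = _DIRECTIVE.match(line)
    if match is None:
        raise ValueError("not a directive: %r" % line)
    handler = HANDLERS[match.group("name")]
    return match.group("substitution"), handler(match.group("argument"))


print(parse_directive(".. image:: picture.png"))
print(parse_directive(".. |biohazard| image:: biohazard.png"))
```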
+ - Section structure markup.
+ Section headers in reStructuredText use adornment via underlines
+ (and possibly overlines) rather than indentation. For example::
+ This is a Section Title
+ =======================
+ This is a Subsection Title
+ --------------------------
+ This paragraph is in the subsection.
+ This is Another Section Title
+ =============================
+ This paragraph is in the second section.
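Underline-based sectioning can be recognized without any indentation. The sketch below assumes that a title is a text line followed by a line of one repeated punctuation character at least as long as the title, and that nesting levels follow the order in which underline styles first appear. `find_sections` is a hypothetical helper; overlines and error handling are left out.

```python
import string


def find_sections(text):
    """Return (level, title) pairs for underlined section titles."""
    lines = text.splitlines()
    styles = []    # underline characters, in order of first use
    sections = []
    for title, underline in zip(lines, lines[1:]):
        char = underline[:1]
        if (
            title.strip()
            and char != ""
            and char in string.punctuation
            and underline == char * len(underline)
            and len(underline) >= len(title)
        ):
            if char not in styles:
                styles.append(char)
            sections.append((styles.index(char) + 1, title))
    return sections


sample = """\
This is a Section Title
=======================

This is a Subsection Title
--------------------------

This paragraph is in the subsection.

This is Another Section Title
=============================

This paragraph is in the second section.
"""

for level, title in find_sections(sample):
    print(level, title)
```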
+ Q: Is reStructuredText rich enough?
+ A: Yes, it is for most people. If it lacks some construct that is
+ required for a specific application, it can be added via the
+ directive mechanism. If a common construct has been
+ overlooked and a suitably readable syntax can be found, it can
+ be added to the specification and parser.
+ Q: Is reStructuredText *too* rich?
+ Since the very beginning, whenever a markup syntax has been
+ proposed on the Doc-SIG, someone has complained about the lack
+ of support for some construct or other. The reply was often
+ something like, "These are docstrings we're talking about, and
+ docstrings shouldn't have complex markup." The problem is that
+ a construct that seems superfluous to one person may be
+ absolutely essential to another.
+ reStructuredText takes the opposite approach: it provides a
+ rich set of implicit markup constructs (plus a generic
+ extension mechanism for explicit markup), allowing for all
+ kinds of documents. If the set of constructs is too rich for a
+ particular application, the unused constructs can either be
+ removed from the parser (via application-specific overrides) or
+ simply omitted by convention.
+ Q: Why not use indentation for section structure, like
+ StructuredText does? Isn't it more "Pythonic"?
+ A: Guido van Rossum wrote the following in a 2001-06-13 Doc-SIG
+ I still think that using indentation to indicate sectioning
+ is wrong. If you look at how real books and other print
+ publications are laid out, you'll notice that indentation
+ is used frequently, but mostly at the intra-section level.
+ Indentation can be used to offset lists, tables,
+ quotations, examples, and the like. (The argument that
+ docstrings are different because they are input for a text
+ formatter is wrong: the whole point is that they are also
+ readable without processing.)
+ I reject the argument that using indentation is Pythonic:
+ text is not code, and different traditions and conventions
+ hold. People have been presenting text for readability for
+ over 30 centuries. Let's not innovate needlessly.
+ See "Section Structure via Indentation" in "Problems With
+ StructuredText" [14]_ for further elaboration.
+ Q: Why use reStructuredText for PEPs? What's wrong with the
+ A: The existing standard for PEPs is very limited in terms of
+ general expressibility, and referencing is especially lacking
+ for such a reference-rich document type. PEPs are currently
+ converted into HTML, but the results (mostly monospaced text)
+ are less than attractive, and most of the value-added potential
+ Making reStructuredText the standard markup for PEPs will
+ enable much richer expression, including support for section
+ structure, inline markup, graphics, and tables. In several
+ PEPs there are ASCII graphics diagrams, which are all that
+ plaintext documents can support. Since PEPs are made available
+ in HTML form, the ability to include proper diagrams would be
+ Current PEP practices allow for reference markers in the form
+ "[1]" in the text, and the footnotes/references themselves are
+ listed in a section toward the end of the document. There is
+ currently no hyperlinking between the reference marker and the
+ footnote/reference itself (it would be possible to add this to
+ pep2html.py, but the "markup" as it stands is ambiguous and
+ mistakes would be inevitable). A PEP with many references
+ (such as this one ;-) requires a lot of flipping back and
+ forth. When revising a PEP, often new references are added or
+ unused references deleted. It is painful to renumber the
+ references, since it has to be done in two places and can have
+ a cascading effect (insert a single new reference 1, and every
+ other reference has to be renumbered; always adding new
+ references to the end is suboptimal). It is easy for
+ references to go out of sync.
+ PEPs use references for two purposes: simple URL references and
+ footnotes. reStructuredText differentiates between the two. A
+ PEP might contain references like this::
+ This PEP proposes adding frungible doodads [1] to the
+ core. It extends PEP 9876 [2] via the BCA [3]
+ References and Footnotes
+ [1] http://www.doodads.org/frungible.html
+ [2] PEP 9876, Let's Hope We Never Get Here
+ [3] "Bogus Complexity Addition"
+ Reference 1 is a simple URL reference. Reference 2 is a
+ footnote containing text and a URL. Reference 3 is a footnote
+ containing text only. Rewritten using reStructuredText, this
+ PEP could look like this::
+ This PEP proposes adding `frungible doodads`_ to the
+ core. It extends PEP 9876 [#pep9876]_ via the BCA [#]_
+ .. [#pep9876] `PEP 9876`__, Let's Hope We Never Get Here
+ __ http://www.python.org/peps/pep-9876.html
+ .. [#] "Bogus Complexity Addition"
+ URLs and footnotes can be defined close to their references if
+ desired, making them easier to read in the source text, and
+ making the PEPs easier to revise. The "References and
+ Footnotes" section can be auto-generated with a document tree
+ transform. Footnotes from throughout the PEP would be gathered
+ and displayed under a standard header. If URL references
+ should likewise be written out explicitly (in citation form),
+ another tree transform could be used.
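The automatic footnote numbering used in the rewritten example can be sketched as a two-pass substitution: number the footnote definitions in document order, then resolve each reference by label, or by position for anonymous footnotes. This is a simplified illustration with a hypothetical function name, `number_footnotes`; the actual parser's numbering rules are more involved.

```python
import re

# Footnote definitions look like ".. [#label] text" or ".. [#] text".
_DEFINITION = re.compile(r"^\.\. \[#(?P<label>\w*)\]", re.MULTILINE)
# Footnote references look like "[#label]_" or "[#]_".
_REFERENCE = re.compile(r"\[#(?P<label>\w*)\]_")


def number_footnotes(source):
    """Replace auto-numbered footnote references with their numbers."""
    labelled = {}
    anonymous = []
    for number, match in enumerate(_DEFINITION.finditer(source), start=1):
        label = match.group("label")
        if label:
            labelled[label] = number
        else:
            anonymous.append(number)
    remaining = iter(anonymous)

    def replace(match):
        label = match.group("label")
        number = labelled[label] if label else next(remaining)
        return "[%d]" % number

    return _REFERENCE.sub(replace, source)


sample = """\
It extends PEP 9876 [#pep9876]_ via the BCA [#]_ and,
as noted, PEP 9876 [#pep9876]_ again.

.. [#pep9876] PEP 9876, Let's Hope We Never Get Here
.. [#] "Bogus Complexity Addition"
"""

print(number_footnotes(sample))
```

Because numbers are assigned at processing time, inserting or deleting a footnote never requires renumbering by hand.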
+ URL references can be named ("frungible doodads"), and can be
+ referenced from multiple places in the document without
+ additional definitions. When converted to HTML, references
+ will be replaced with inline hyperlinks (HTML <A> tags). The
+ two footnotes are automatically numbered, so they will always
+ stay in sync. The first footnote also contains an internal
+ reference name, "pep9876", so it's easier to see the connection
+ between reference and footnote in the source text. Named
+ footnotes can be referenced multiple times, maintaining
+ The "#pep9876" footnote could also be written in the form of a
+ It extends PEP 9876 [PEP9876]_ ...
+ .. [PEP9876] `PEP 9876`_, Let's Hope We Never Get Here
+ Footnotes are numbered, whereas citations use text for their
+ Q: Wouldn't it be better to keep the docstring and PEP proposals
+ A: The PEP markup proposal is an optional part of this PEP. It may be
+ removed if it is deemed that there is no need for PEP markup.
+ The PEP markup proposal could be made into a separate PEP if
+ necessary. If accepted, PEP 1, PEP Purpose and Guidelines [19]_,
+ and PEP 9, Sample PEP Template [20]_ will be updated.
+ It seems natural to adopt a single consistent markup standard
+ for all uses of plaintext in Python.
+ Q: The existing pep2html.py script converts the existing PEP
+ format to HTML. How will the new-format PEPs be converted to
+ A: One of the deliverables of the Docutils project [21]_ will be a
+ new version of pep2html.py with integrated reStructuredText
+ parsing. The Docutils project will support PEPs with a "PEP
+ Reader" component, including all functionality currently in
+ pep2html.py (auto-recognition of PEP & RFC references).
+ Q: Who's going to convert the existing PEPs to reStructuredText?
+ A: A call for volunteers will be put out to the Doc-SIG and
+ greater Python communities. If insufficient volunteers are
+ forthcoming, I (David Goodger) will convert the documents
+ myself, perhaps with some level of automation. A transitional
+ system whereby both old and new standards can coexist will be
+ easy to implement (and I pledge to implement it if necessary).
+ Q: Why use reStructuredText for README and other ancillary files?
+ A: The same reasoning used for PEPs above applies to README and
+ other ancillary files. By adopting a standard markup, these
+ files can be converted to attractive cross-referenced HTML and
+ put up on python.org. Developers of Python projects can also
+ take advantage of this facility for their own documentation.
+References and Footnotes
+ [1] http://structuredtext.sourceforge.net/
+ [2] http://www.python.org/sigs/doc-sig/
+ [3] http://www.w3.org/XML/
+ [4] http://www.oasis-open.org/cover/general.html
+ [5] http://docbook.org/tdg/en/html/docbook.html
+ [6] http://www.w3.org/MarkUp/
+ [7] http://www.w3.org/MarkUp/#xhtml1
+ [8] http://www.tug.org/interest.html
+ [9] http://www.perldoc.com/perl5.6/pod/perlpod.html
+ [10] http://java.sun.com/j2se/javadoc/
+ [11] http://docutils.sourceforge.net/mirror/setext.html
+ [12] http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage
+ [13] An Introduction to reStructuredText
+ [14] Problems with StructuredText
+ [15] reStructuredText Markup Specification
+ [16] A Record of reStructuredText Syntax Alternatives
+ [17] reStructuredText Directives
+ [18] Quick reStructuredText
+ [19] PEP 1, PEP Guidelines, Warsaw, Hylton
+ [20] PEP 9, Sample PEP Template, Warsaw
+ [21] http://docutils.sourceforge.net/
+ [22] PEP 216, Docstring Format, Zadka
+ This document has been placed in the public domain.
+ Some text is borrowed from PEP 216, Docstring Format, by Moshe
+ Special thanks to all members past & present of the Python Doc-SIG.