Anonymous committed 5e8758f

[svn] Quite a few things:
- new pygmentize options, -F for filters and -H for detailed help.
- an automatically-created formatter map
- better HTML formatter subclassing

Files changed (31)

 ----------------------------
 (codename to be selected, released Feb XX, 2007)
 
+- Changed the exception raised if no suitable lexer, formatter etc. is
+  found in one of the `get_*_by_*` functions to a custom exception,
+  `pygments.util.ClassNotFound`. It is, however, a subclass of `ValueError`
+  in order to retain backwards compatibility.
+
+- Added a `-H` command line option which can be used to get the docstring
+  of a lexer, formatter or filter.
+
+- Made the handling of lexers and formatters more consistent. The aliases
+  and filename patterns of formatters are now attributes on them. 
+
 - Added an OCaml lexer, thanks to Adam Blinkinsop.
 
 - Made the HTML formatter more flexible, and easily subclassable in order
   to make it easy to implement custom wrappers, e.g. alternate line
-  number markup.  
+  number markup. See the documentation.
 
 - Added an `outencoding` option to all formatters, making it possible
   to override the `encoding` (which is used by lexers and formatters) when
 
 - Added sources.list lexer by Dennis Kaarsemaker.
 
-- Added token stream filters.
+- Added token stream filters, and a pygmentize option to use them.
 
 - Changed behavior of `in` Operator for tokens. 
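The backwards-compatibility claim in the `ClassNotFound` entry can be illustrated with a small standalone sketch. It is written for Python 3, and nothing in it is imported from Pygments: the exception class is a local stand-in, and `lookup` and `registry` are hypothetical names used only for illustration. The point is that callers written against the old `ValueError` should keep working.

```python
class ClassNotFound(ValueError):
    """Local stand-in for pygments.util.ClassNotFound (assumption for this sketch)."""


def lookup(alias, registry):
    # Hypothetical helper: return the registered object or raise ClassNotFound.
    try:
        return registry[alias]
    except KeyError:
        raise ClassNotFound('no class found for alias %r' % alias)


registry = {'python': 'PythonLexer'}

# Old-style caller that still catches ValueError.
try:
    lookup('cobol', registry)
    caught = None
except ValueError as err:
    caught = err

# New-style caller that catches the more specific exception.
try:
    lookup('cobol', registry)
    caught_specific = None
except ClassNotFound as err:
    caught_specific = err
```

Both handlers receive the same exception type, so existing `except ValueError` blocks do not need to change.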
 
 
 export PYTHONPATH = $(shell echo "$$PYTHONPATH"):$(shell python -c 'print ":".join(line.strip() for line in file("PYTHONPATH"))' 2>/dev/null)
 
-.PHONY: all apidocs check clean clean-pyc codetags docs epydoc lexermap \
+.PHONY: all apidocs check clean clean-pyc codetags docs epydoc mapfiles \
 	pylint reindent test
 
 all: clean-pyc check test
 			apidocs/*.html
 	@$(PYTHON) scripts/fix_epydoc_markup.py apidocs
 
-lexermap:
-	cd pygments/lexers; $(PYTHON) _mapping.py
+mapfiles:
+	(cd pygments/lexers; $(PYTHON) _mapping.py)
+	(cd pygments/formatters; $(PYTHON) _mapping.py)
 
 pylint:
 	@pylint --rcfile scripts/pylintrc pygments
 for 0.7
 -------
 
-- new lexers:
-  * Haskell
-  * Lisp
-  * IPython sessions
-  * HTML with special formatting?
-  * LaTeX special formatting?
-  * Nemerle
-  * Assembler
-  * Objective C
-  * MySQL/PostgreSQL/SQLite
-  * Tcl
+- a MoinMoin parser
 
-- lexers that need review:
+- more unit tests (test pygmentize, test all formatters comprehensively)
+
+
+new lexers
+----------
+
+* Haskell
+* Lisp
+* IPython sessions
+* HTML with special formatting?
+* LaTeX special formatting?
+* Nemerle
+* Assembler
+* Objective C
+* MySQL/PostgreSQL/SQLite
+* Tcl
+
+
+for 0.8 -- 1.0
+--------------
+
+- lexers that need work:
   * review perl lexer (numerous bugs, but so far no one had complaints ;)
   * readd property support for C# lexer? that is, find a regex that doesn't
     backtrack to death...
   * add support for function name highlighting to C++ lexer
 
-- make it possible to use filters from the command line
-
-- automatically get help for lexers/formatters/options from docstrings
-
-- a MoinMoin parser
+- add folding? would require more language-aware parsers...
 
 - allow "overlay" token types to highlight specials: nth line, a word etc.
 
 - pygmentize option presets, more sophisticated method to output styles?
-
-- more unit tests (test pygmentize, test all formatters comprehensively)
-
-
-for 0.8 -- 1.0
---------------
-
-- add folding? would require more language-aware parsers...
     aliases list. The lexer is given the `options` at its
     instantiation.
 
-    Will raise `ValueError` if no lexer with that alias is found.
+    Will raise `pygments.util.ClassNotFound` if no lexer with that alias is
+    found.
 
 def `get_lexer_for_filename(fn, **options):`
     Return a `Lexer` subclass instance that has a filename pattern
     matching `fn`. The lexer is given the `options` at its
     instantiation.
 
-    Will raise `ValueError` if no lexer for that filename is found.
+    Will raise `pygments.util.ClassNotFound` if no lexer for that filename is
+    found.
 
 def `get_lexer_for_mimetype(mime, **options):`
     Return a `Lexer` subclass instance that has `mime` in its mimetype
     list. The lexer is given the `options` at its instantiation.
 
-    Will raise `ValueError` if not lexer for that mimetype is found.
+    Will raise `pygments.util.ClassNotFound` if no lexer for that mimetype is
+    found.
 
 def `guess_lexer(text, **options):`
     Return a `Lexer` subclass instance that's guessed from the text
     lexer class is called with the text as argument, and the lexer
     which returned the highest value will be instantiated and returned.
 
-    `ValueError` is raised if no lexer thinks it can handle the content.
+    `pygments.util.ClassNotFound` is raised if no lexer thinks it can handle the
+    content.
 
 def `guess_lexer_for_filename(text, filename, **options):`
     As `guess_lexer()`, but only lexers which have a pattern in `filenames`
     or `alias_filenames` that matches `filename` are taken into consideration.
     
-    `ValueError` is raised if no lexer thinks it can handle the content.
+    `pygments.util.ClassNotFound` is raised if no lexer thinks it can handle the
+    content.
 
 def `get_all_lexers():`
     Return an iterable over all registered lexers, yielding tuples in the
     aliases list. The formatter is given the `options` at its
     instantiation.
 
-    Will raise `ValueError` if no formatter with that alias is found.
+    Will raise `pygments.util.ClassNotFound` if no formatter with that alias is
+    found.
 
 def `get_formatter_for_filename(fn, **options):`
     Return a `Formatter` subclass instance that has a filename pattern
     matching `fn`. The formatter is given the `options` at its
     instantiation.
 
-    Will raise `ValueError` if no formatter for that filename is found.
+    Will raise `pygments.util.ClassNotFound` if no formatter for that filename
+    is found.
 
 
 Functions from `pygments.styles`:
     Return a style class by its short name. The names of the builtin styles
     are listed in `pygments.styles.STYLE_MAP`.
 
-    Will raise `ValueError` if no style of that name is found.
+    Will raise `pygments.util.ClassNotFound` if no style of that name is found.
 
 def `get_all_styles():`
     Return an iterable over all registered styles, yielding their names.

docs/src/cmdline.txt

 
     $ pygmentize -f html -O style=colorful,linenos=1 -l python test.py
 
-Be sure to enclose the option string in quotes if it contains any special
-shell characters, such as spaces or expansion wildcards like ``*``.
+Be sure to enclose the option string in quotes if it contains any special shell
+characters, such as spaces or expansion wildcards like ``*``. If an option
+expects a list value, separate the list entries with spaces (you'll have to
+quote the option value in this case too, so that the shell doesn't split it).
+
+Filters are added to the token stream using the ``-F`` option::
+
+    $ pygmentize -f html -l pascal -F keywordcase:case=upper main.pas
+
+As you see, options for the filter are given after a colon. As for ``-O``, the
+filter name and options must be one shell word, so there may not be any spaces
+around the colon.
 
 There's a special ``-S`` option for generating style definitions. Usage is
 as follows::
 For an explanation of what ``-a`` means for `a particular formatter`_, look for
 the `arg` argument for the formatter's `get_style_defs()` method.
 
-The ``-L`` option lists all lexers and formatters, along with their short
-names and supported file name extensions.
+The ``-L`` option lists lexers and formatters, along with their short names and
+supported file name extensions, as well as styles and filters. If you want to
+see only one category, give it as an argument::
+
+    $ pygmentize -L filters
+
+will list only the installed filters.
+
+The ``-H`` option will give you detailed information (the same that can be found
+in this documentation) about a lexer, formatter or filter. Usage is as follows::
+
+    $ pygmentize -H formatter html
+
+will print the help for the HTML formatter, while::
+
+    $ pygmentize -H lexer python
+
+will print the help for the Python lexer, etc.
 
 
 .. _a particular formatter: formatters.txt
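The `-F name:options` syntax described in this file can be sketched as a pair of small parsing functions. This is a Python 3 re-statement of the `_parse_options` and `_parse_filters` helpers added to `pygments/cmdline.py` in this commit, with the function names shortened for the sketch; it is illustrative, not the shipped code.

```python
def parse_options(option_strings):
    """Turn strings like 'a=1,b' into {'a': '1', 'b': True}."""
    opts = {}
    for option_string in option_strings or []:
        if not option_string:
            continue
        for arg in option_string.split(','):
            arg = arg.strip()
            try:
                key, value = arg.split('=')
            except ValueError:
                # No '=' (or more than one): treat the whole item as a flag.
                opts[arg] = True
            else:
                opts[key.strip()] = value.strip()
    return opts


def parse_filters(filter_strings):
    """Turn strings like 'name:opt=val' into (name, options) pairs."""
    filters = []
    for spec in filter_strings or []:
        if ':' in spec:
            # Split on the first colon only; the rest is the option string.
            name, raw_opts = spec.split(':', 1)
            filters.append((name, parse_options([raw_opts])))
        else:
            filters.append((spec, {}))
    return filters


print(parse_filters(['keywordcase:case=upper', 'codetagify']))
```

Because the spec is split on the colon without any stripping, a space around the colon would end up inside the filter name or the first option key, which is why the documentation asks for a single shell word.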

pygments/cmdline.py

 """
 import sys
 import getopt
+from textwrap import dedent
 
 from pygments import __version__, __author__, highlight
-from pygments.lexers import LEXERS, get_lexer_by_name, get_lexer_for_filename
-from pygments.util import OptionError
-from pygments.formatters import FORMATTERS, get_formatter_by_name, \
-     get_formatter_for_filename, TerminalFormatter
+from pygments.util import ClassNotFound, OptionError, docstring_headline
+from pygments.lexers import get_all_lexers, get_lexer_by_name, get_lexer_for_filename, \
+     find_lexer_class
+from pygments.formatters import get_all_formatters, get_formatter_by_name, \
+     get_formatter_for_filename, TerminalFormatter, find_formatter_class
+from pygments.filters import get_all_filters, find_filter_class
+from pygments.styles import get_all_styles, get_style_by_name
 
 
 USAGE = """\
-Usage: %s [-l <lexer>] [-f <formatter>] [-O <options>] [-o <outfile>] [<infile>]
+Usage: %s [-l <lexer>] [-F <filter>[:<options>]] [-f <formatter>]
+          [-O <options>] [-o <outfile>] [<infile>]
+
        %s -S <style> -f <formatter> [-a <arg>] [-O <options>]
-       %s -L | -h | -V
+       %s -L [<which> ...]
+       %s -H <type> <name>
+       %s -h | -V
 
 Highlight the input file and write the result to <outfile>.
 
 With the -O option, you can give the lexer and formatter a comma-
 separated list of options, e.g. ``-O bg=light,python=cool``.
 
+With the -F option, you can add filters to the token stream; you can
+give options in the same way as for -O after a colon (note: there must
+not be spaces around the colon).
+
+The -O and -F options can be given multiple times.
+
 With the -S option, print out style definitions for style <style>
 for formatter <formatter>. The argument given by -a is formatter
 dependent.
 
-The -L option lists all available lexers and formatters.
+The -L option lists lexers, formatters, styles or filters -- set
+`which` to the thing you want to list (e.g. "styles"), or omit it to
+list everything.
+
+The -H option prints detailed help for the object <name> of type <type>,
+where <type> is one of "lexer", "formatter" or "filter".
+
 The -h option prints this help.
 The -V option prints the package version.
 """
 
 
-def _parse_options(o_str):
+def _parse_options(o_strs):
     opts = {}
-    if not o_str:
+    if not o_strs:
         return opts
-    o_args = o_str.split(',')
-    for o_arg in o_args:
-        o_arg = o_arg.strip()
-        try:
-            o_key, o_val = o_arg.split('=')
-            o_key = o_key.strip()
-            o_val = o_val.strip()
-        except ValueError:
-            opts[o_arg] = True
-        else:
-            opts[o_key] = o_val
+    for o_str in o_strs:
+        if not o_str:
+            continue
+        o_args = o_str.split(',')
+        for o_arg in o_args:
+            o_arg = o_arg.strip()
+            try:
+                o_key, o_val = o_arg.split('=')
+                o_key = o_key.strip()
+                o_val = o_val.strip()
+            except ValueError:
+                opts[o_arg] = True
+            else:
+                opts[o_key] = o_val
     return opts
 
 
-def _print_lflist():
-    # print version
-    main(['', '-V'])
+def _parse_filters(f_strs):
+    filters = []
+    if not f_strs:
+        return filters
+    for f_str in f_strs:
+        if ':' in f_str:
+            fname, fopts = f_str.split(':', 1)
+            filters.append((fname, _parse_options([fopts])))
+        else:
+            filters.append((f_str, {}))
+    return filters
 
-    print
-    print "Lexers:"
-    print "~~~~~~~"
 
-    info = []
-    for _, fullname, names, exts, _ in LEXERS.itervalues():
-        tup = (', '.join(names)+':', fullname,
-               exts and '(extensions ' + ', '.join(exts) + ')' or '')
-        info.append(tup)
-    info.sort()
-    for i in info:
-        print ('%s\n    %s %s') % i
+def _print_help(type, name):
+    try:
+        if type == 'lexer':
+            cls = find_lexer_class(name)
+            print "Help on the %s lexer:" % cls.name
+            print dedent(cls.__doc__)
+        elif type == 'formatter':
+            cls = find_formatter_class(name)
+            print "Help on the %s formatter:" % cls.name
+            print dedent(cls.__doc__)
+        elif type == 'filter':
+            cls = find_filter_class(name)
+            print "Help on the %s filter:" % name
+            print dedent(cls.__doc__)
+    # the find_*_class helpers may return None instead of raising
+    except (AttributeError, TypeError, ClassNotFound):
+        print >>sys.stderr, "%s not found!" % type
 
-    print
-    print "Formatters:"
-    print "~~~~~~~~~~~"
 
-    info = []
-    for fullname, names, exts, doc in FORMATTERS.itervalues():
-        tup = (', '.join(names)+':', doc,
-               exts and '(extensions ' + ', '.join(exts) + ')' or '')
-        info.append(tup)
-    info.sort()
-    for i in info:
-        print ('%s\n    %s %s') % i
+def _print_list(what):
+    if what == 'lexer':
+        print
+        print "Lexers:"
+        print "~~~~~~~"
+
+        info = []
+        for fullname, names, exts, _ in get_all_lexers():
+            tup = (', '.join(names)+':', fullname,
+                   exts and '(filenames ' + ', '.join(exts) + ')' or '')
+            info.append(tup)
+        info.sort()
+        for i in info:
+            print ('* %s\n    %s %s') % i
+
+    elif what == 'formatter':
+        print
+        print "Formatters:"
+        print "~~~~~~~~~~~"
+
+        info = []
+        for cls in get_all_formatters():
+            doc = docstring_headline(cls)
+            tup = (', '.join(cls.aliases) + ':', doc,
+                   cls.filenames and '(filenames ' + ', '.join(cls.filenames) + ')' or '')
+            info.append(tup)
+        info.sort()
+        for i in info:
+            print ('* %s\n    %s %s') % i
+
+    elif what == 'filter':
+        print
+        print "Filters:"
+        print "~~~~~~~~"
+
+        for name in get_all_filters():
+            cls = find_filter_class(name)
+            print "* " + name + ':'
+            print "    %s" % docstring_headline(cls)
+
+    elif what == 'style':
+        print
+        print "Styles:"
+        print "~~~~~~~"
+        
+        for name in get_all_styles():
+            cls = get_style_by_name(name)
+            print "* " + name + ':'
+            print "    %s" % docstring_headline(cls) 
 
 
 def main(args):
     """
     Main command line entry point.
     """
-    usage = USAGE % ((args[0],) * 3)
+    usage = USAGE % ((args[0],) * 5)
 
     try:
-        opts, args = getopt.getopt(args[1:], "l:f:o:O:LhVS:a:")
-    except getopt.GetoptError:
+        popts, args = getopt.getopt(args[1:], "l:f:F:o:O:LS:a:hVH")
+    except getopt.GetoptError, err:
         print >>sys.stderr, usage
         return 2
-    opts = dict(opts)
+    opts = {}
+    O_opts = []
+    F_opts = []
+    for opt, arg in popts:
+        if opt == '-O':
+            O_opts.append(arg)
+        elif opt == '-F':
+            F_opts.append(arg)
+        opts[opt] = arg
 
     if not opts and not args:
         print usage
     # handle ``pygmentize -L``
     L_opt = opts.pop('-L', None)
     if L_opt is not None:
-        if opts or args:
+        if opts:
             print >>sys.stderr, usage
             return 2
 
-        _print_lflist()
+        # print version
+        main(['', '-V'])
+        if not args:
+            args = ['lexer', 'formatter', 'filter', 'style']
+        for arg in args:
+            _print_list(arg.rstrip('s'))
+        return 0
+
+    # handle ``pygmentize -H``
+    H_opt = opts.pop('-H', None)
+    if H_opt is not None:
+        if opts or len(args) != 2:
+            print >>sys.stderr, usage
+            return 2
+
+        type, name = args
+        if type not in ('lexer', 'formatter', 'filter'):
+            print >>sys.stderr, usage
+            return 2
+
+        _print_help(type, name)
         return 0
 
     # parse -O options
-    O_opts = _parse_options(opts.pop('-O', None))
+    O_opts = _parse_options(O_opts)
+    # parse -F options
+    F_opts = _parse_filters(F_opts)
 
     # handle ``pygmentize -S``
     S_opt = opts.pop('-S', None)
         try:
             O_opts['style'] = S_opt
             fmter = get_formatter_by_name(f_opt, **O_opts)
-        except ValueError, err:
+        except ClassNotFound, err:
             print >>sys.stderr, err
             return 1
 
     if fmter:
         try:
             fmter = get_formatter_by_name(fmter, **O_opts)
-        except (OptionError, ValueError), err:
+        except (OptionError, ClassNotFound), err:
             print >>sys.stderr, 'Error:', err
             return 1
 
         if not fmter:
             try:
                 fmter = get_formatter_for_filename(outfn, **O_opts)
-            except (OptionError, ValueError), err:
+            except (OptionError, ClassNotFound), err:
                 print >>sys.stderr, 'Error:', err
                 return 1
         try:
     if lexer:
         try:
             lexer = get_lexer_by_name(lexer, **O_opts)
-        except (OptionError, ValueError), err:
+        except (OptionError, ClassNotFound), err:
             print >>sys.stderr, 'Error:', err
             return 1
 
         if not lexer:
             try:
                 lexer = get_lexer_for_filename(infn, **O_opts)
-            except (OptionError, ValueError), err:
+            except (OptionError, ClassNotFound), err:
                 print >>sys.stderr, 'Error:', err
                 return 1
 
 
     # ... and do it!
     try:
+        # process filters
+        for fname, fopts in F_opts:
+            lexer.add_filter(fname, **fopts)
         highlight(code, lexer, fmter, outfile)
     except Exception, err:
         import traceback
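The change to `main()` above stops converting the `getopt` result straight into a dict, since a dict would keep only the last `-O` or `-F` value. A minimal Python 3 sketch of that collection step follows; `collect_options` and the `repeated` dict are names invented for this sketch, while the option string is the one used in the diff.

```python
import getopt


def collect_options(argv):
    """Return (opts, repeated, rest) for a pygmentize-like argument list."""
    parsed, rest = getopt.getopt(argv, "l:f:F:o:O:LS:a:hVH")
    opts = {}
    repeated = {'-O': [], '-F': []}
    for opt, arg in parsed:
        if opt in repeated:
            # Keep every occurrence, in command-line order.
            repeated[opt].append(arg)
        # The plain dict only remembers the last value per option.
        opts[opt] = arg
    return opts, repeated, rest


opts, repeated, rest = collect_options(
    ['-l', 'pascal', '-F', 'keywordcase:case=upper', '-F', 'codetagify',
     '-O', 'style=colorful', 'main.pas'])
```

Here `repeated['-F']` holds both filter specs, whereas `opts['-F']` holds only the second one.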

pygments/filters/__init__.py

     Module containing filter lookup functions and default
     filters.
 
-    :copyright: 2006-2007 by Armin Ronacher.
+    :copyright: 2006-2007 by Armin Ronacher, Georg Brandl.
     :license: BSD, see LICENSE for more details.
 """
 try:
 import re
 from pygments.token import String, Comment, Keyword, Name, string_to_tokentype
 from pygments.filter import Filter
-from pygments.util import get_list_opt
+from pygments.util import get_list_opt, ClassNotFound
 from pygments.plugin import find_plugin_filters
 
 
-def find_filter(filter, **options):
+def find_filter_class(filter):
     """
-    Lookup a builtin filter. Options are passed to the
-    filter initialization if wanted.
+    Look up a filter by name. Return None if not found.
     """
     if filter in FILTERS:
-        return FILTERS[filter](**options)
+        return FILTERS[filter]
     for name, cls in find_plugin_filters():
         if name == filter:
-            return cls(**options)
-    raise ValueError('filter %r not found' % filter)
+            return cls
+    return None
+
+
+def get_filter_by_name(filter, **options):
+    """
+    Return an instantiated filter. Options are passed to the filter
+    initializer if wanted. Raise a ClassNotFound if not found.
+    """
+    cls = find_filter_class(filter)
+    if cls:
+        return cls(**options)
+    else:
+        raise ClassNotFound('filter %r not found' % filter)
 
 
 def get_all_filters():
     """
-    Return a generator for all filters by name.
+    Return a generator of all filter names.
     """
     for name in FILTERS:
         yield name
 
 class CodeTagFilter(Filter):
     """
-    Highlights special code tags in comments and docstrings. Per default, the
-    list of highlighted tags is ``XXX``, ``TODO``, ``BUG`` and ``NOTE``. You can
-    override this list by specifying a `codetags` parameter that takes a list of
-    words.
+    Highlight special code tags in comments and docstrings.
+
+    Per default, the list of highlighted tags is ``XXX``, ``TODO``, ``BUG`` and
+    ``NOTE``. You can override this list by specifying a `codetags` parameter
+    that takes a list of words.
     """
     def __init__(self, **options):
         Filter.__init__(self, **options)
 
 class KeywordCaseFilter(Filter):
     """
-    Converts keywords to ``lower``, ``upper`` or ``capitalize`` which means
-    first letter uppercase, rest lowercase. This can be useful e.g. if you
-    highlight Pascal code and want to adapt the code to your styleguide. The
-    default is ``lower``, override that by providing the `keywordcase`
-    parameter.
+    Convert keywords to ``lower``, ``upper`` or ``capitalize`` which means
+    first letter uppercase, rest lowercase.
+
+    This can be useful e.g. if you highlight Pascal code and want to adapt the
+    code to your styleguide. The default is ``lower``, override that by
+    providing the `case` parameter.
     """
 
     def __init__(self, **options):
         Filter.__init__(self, **options)
-        case = options.get('keywordcase', 'lower')
+        case = options.get('case', 'lower')
         if case not in ('lower', 'upper', 'capitalize'):
             raise TypeError('unknown conversion method %r' % case)
         self.convert = getattr(unicode, case)
 
 class NameHighlightFilter(Filter):
     """
-    Highlight normal name token with a different one::
+    Highlight a normal Name token with a different token type.
+
+    Example::
 
         filter = NameHighlightFilter(
-            highlight=['foo', 'bar', 'baz'],
-            highlight_token=Name.Function
+            names=['foo', 'bar', 'baz'],
+            tokentype=Name.Function,
         )
 
     This would highlight the names "foo", "bar" and "baz"
 
     def __init__(self, **options):
         Filter.__init__(self, **options)
-        self.words = set(get_list_opt(options, 'highlight', []))
-        highlight_token = options.get('highlight_token')
-        if highlight_token:
-            self.highlight_token = string_to_tokentype(highlight_token)
+        self.names = set(get_list_opt(options, 'names', []))
+        tokentype = options.get('tokentype')
+        if tokentype:
+            self.tokentype = string_to_tokentype(tokentype)
         else:
-            self.highlight_token = Name.Function
+            self.tokentype = Name.Function
 
     def filter(self, lexer, stream):
         for ttype, value in stream:
-            if ttype is Name and value in self.words:
-                yield self.highlight_token, value
+            if ttype is Name and value in self.names:
+                yield self.tokentype, value
             else:
                 yield ttype, value
 
 
 FILTERS = {
-    'codetagify':           CodeTagFilter,
-    'keywordcase':          KeywordCaseFilter,
-    'highlight':            NameHighlightFilter
+    'codetagify':     CodeTagFilter,
+    'keywordcase':    KeywordCaseFilter,
+    'highlight':      NameHighlightFilter,
 }
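The split between a class lookup that returns `None` and an instantiating lookup that raises, together with the generator-style `filter()` method, can be sketched without Pygments installed. In the Python 3 sketch below, the exception class is a local stand-in, `UppercaseKeywords` is a toy filter invented for illustration, and token types are plain strings rather than real Pygments token types.

```python
class ClassNotFound(ValueError):
    """Local stand-in for pygments.util.ClassNotFound (assumption for this sketch)."""


class UppercaseKeywords(object):
    """Toy filter: upper-cases values whose token type is 'Keyword'."""

    def __init__(self, **options):
        self.options = options

    def filter(self, lexer, stream):
        # A filter consumes (tokentype, value) pairs and yields new pairs.
        for ttype, value in stream:
            if ttype == 'Keyword':
                yield ttype, value.upper()
            else:
                yield ttype, value


FILTERS = {'upperkeywords': UppercaseKeywords}


def find_filter_class(name):
    # Class lookup only: returns None when the name is unknown.
    return FILTERS.get(name)


def get_filter_by_name(name, **options):
    # Instantiating lookup: raises when the name is unknown.
    cls = find_filter_class(name)
    if cls is None:
        raise ClassNotFound('filter %r not found' % name)
    return cls(**options)


tokens = [('Keyword', 'begin'), ('Name', 'x'), ('Keyword', 'end')]
result = list(get_filter_by_name('upperkeywords').filter(None, tokens))
```

Keeping the non-raising lookup separate lets callers such as a help command inspect a class (for example its docstring) without instantiating it.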

pygments/formatter.py

         Overrides ``encoding`` if given.
     """
 
+    #: Name of the formatter
+    name = None
+
+    #: Shortcuts for the formatter
+    aliases = []
+
+    #: fn match rules
+    filenames = []
+
     #: If True, this formatter outputs Unicode strings when no encoding
     #: option is given.
     unicodeoutput = True

pygments/formatters/__init__.py

     :license: BSD, see LICENSE for more details.
 """
 import os.path
-from pygments.formatters.html import HtmlFormatter
-from pygments.formatters.terminal import TerminalFormatter
-from pygments.formatters.latex import LatexFormatter
-from pygments.formatters.rtf import RtfFormatter
-from pygments.formatters.bbcode import BBCodeFormatter
-from pygments.formatters.other import NullFormatter, RawTokenFormatter
+import fnmatch
+
+from pygments.formatters._mapping import FORMATTERS
 from pygments.plugin import find_plugin_formatters
+from pygments.util import docstring_headline, ClassNotFound
 
+ns = globals()
+for cls in FORMATTERS:
+    ns[cls.__name__] = cls
 
-def _doc_desc(obj):
-    if not obj.__doc__:
-        return ''
-    res = []
-    for line in obj.__doc__.strip().splitlines():
-        if line.strip():
-            res.append(" " + line.strip())
-        else:
-            break
-    return ''.join(res)
+__all__ = ['get_formatter_by_name', 'get_formatter_for_filename',
+           'get_all_formatters'] + [cls.__name__ for cls in FORMATTERS]
+           
 
-
-#: Map formatter classes to ``(longname, names, file extensions, descr)``.
-FORMATTERS = {
-    HtmlFormatter:        ('HTML', ('html',), ('.htm', '.html'),
-                           _doc_desc(HtmlFormatter)),
-    LatexFormatter:       ('LaTeX', ('latex', 'tex'), ('.tex',),
-                           _doc_desc(LatexFormatter)),
-    RtfFormatter:         ('RTF', ('rtf',), ('.rtf',),
-                           _doc_desc(RtfFormatter)),
-    TerminalFormatter:    ('Terminal', ('terminal', 'console'), (),
-                           _doc_desc(TerminalFormatter)),
-    BBCodeFormatter:      ('BBcode', ('bbcode', 'bb'), (),
-                           _doc_desc(BBCodeFormatter)),
-    RawTokenFormatter:    ('Raw tokens', ('raw', 'tokens'), ('.raw',),
-                           _doc_desc(RawTokenFormatter)),
-    NullFormatter:        ('Text only', ('text', 'null'), ('.txt',),
-                           _doc_desc(NullFormatter)),
-}
-
-
-_formatter_cache = {}
+_formatter_alias_cache = {}
+_formatter_filename_cache = []
 
 def _init_formatter_cache():
-    if _formatter_cache:
+    if _formatter_alias_cache:
         return
-    for cls, info in FORMATTERS.iteritems():
-        for alias in info[1]:
-            _formatter_cache[alias] = cls
-        for ext in info[2]:
-            _formatter_cache["/"+ext] = cls
-    for name, cls in find_plugin_formatters():
-        _formatter_cache[name] = cls
+    for cls in get_all_formatters():
+        for alias in cls.aliases:
+            _formatter_alias_cache[alias] = cls
+        for fn in cls.filenames:
+            _formatter_filename_cache.append((fn, cls))
+
+
+def find_formatter_class(name):
+    _init_formatter_cache()
+    cls = _formatter_alias_cache.get(name, None)
+    return cls
 
 
 def get_formatter_by_name(name, **options):
     _init_formatter_cache()
-    cls = _formatter_cache.get(name, None)
+    cls = _formatter_alias_cache.get(name, None)
     if not cls:
-        raise ValueError("No formatter found for name %r" % name)
+        raise ClassNotFound("No formatter found for name %r" % name)
     return cls(**options)
 
 
 def get_formatter_for_filename(fn, **options):
     _init_formatter_cache()
-    # try by filename extension
-    cls = _formatter_cache.get("/"+os.path.splitext(fn)[1], None)
-    if cls:
-        return cls(**options)
-    # try by whole file name
-    cls = _formatter_cache.get("/"+os.path.basename(fn), None)
-    if not cls:
-        raise ValueError("No formatter found for file name %r" % fn)
-    return cls(**options)
+    fn = os.path.basename(fn)
+    for pattern, cls in _formatter_filename_cache:
+        if fnmatch.fnmatch(fn, pattern):
+            return cls(**options)
+    raise ClassNotFound("No formatter found for file name %r" % fn)
 
 
 def get_all_formatters():
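The new lookup scheme in this file builds one table keyed by alias and one list of filename patterns, both read from class attributes, and matches file names with `fnmatch`. A self-contained Python 3 sketch of the idea follows; the two classes are bare stand-ins carrying only the attributes shown in the diff, and `find_class_for_filename` is a hypothetical name that returns `None` instead of raising.

```python
import fnmatch
import os.path


class HtmlFormatter(object):
    name = 'HTML'
    aliases = ['html']
    filenames = ['*.html', '*.htm']


class NullFormatter(object):
    name = 'Text only'
    aliases = ['text', 'null']
    filenames = ['*.txt']


ALL_FORMATTERS = [HtmlFormatter, NullFormatter]

# Build both lookup tables from the class attributes.
alias_cache = {}
filename_cache = []
for cls in ALL_FORMATTERS:
    for alias in cls.aliases:
        alias_cache[alias] = cls
    for pattern in cls.filenames:
        filename_cache.append((pattern, cls))


def find_class_for_filename(path):
    """Return the first class whose pattern matches the base name, else None."""
    base = os.path.basename(path)
    for pattern, cls in filename_cache:
        if fnmatch.fnmatch(base, pattern):
            return cls
    return None
```

Compared with the removed extension-keyed dict, glob patterns allow matches that are not tied to a file extension, at the cost of a linear scan over the pattern list.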

pygments/formatters/_mapping.py

+# -*- coding: utf-8 -*-
+"""
+    pygments.formatters._mapping
+    ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+    Formatter mapping definitions. This file is generated by itself. Every time
+    you change something on a builtin formatter definition, run this script from
+    the formatters folder to update it.
+
+    Do not alter the FORMATTERS dictionary by hand.
+
+    :copyright: 2006-2007 by Armin Ronacher, Georg Brandl.
+    :license: BSD, see LICENSE for more details.
+"""
+
+from pygments.util import docstring_headline
+
+# start
+from pygments.formatters.bbcode import BBCodeFormatter
+from pygments.formatters.html import HtmlFormatter
+from pygments.formatters.latex import LatexFormatter
+from pygments.formatters.other import NullFormatter
+from pygments.formatters.other import RawTokenFormatter
+from pygments.formatters.rtf import RtfFormatter
+from pygments.formatters.terminal import TerminalFormatter
+
+FORMATTERS = {
+    BBCodeFormatter: ('BBCode', ('bbcode', 'bb'), (), 'Format tokens with BBcodes. These formatting codes are used by many bulletin boards, so you can highlight your sourcecode with pygments before posting it there.'),
+    HtmlFormatter: ('HTML', ('html',), ('*.html', '*.htm'), "Format tokens as HTML 4 ``<span>`` tags within a ``<pre>`` tag, wrapped in a ``<div>`` tag. The ``<div>``'s CSS class can be set by the `cssclass` option."),
+    LatexFormatter: ('LaTeX', ('latex', 'tex'), ('*.tex',), 'Format tokens as LaTeX code. This needs the `fancyvrb` and `color` standard packages.'),
+    NullFormatter: ('Text only', ('text', 'null'), ('*.txt',), 'Output the text unchanged without any formatting.'),
+    RawTokenFormatter: ('Raw tokens', ('raw', 'tokens'), ('*.raw',), 'Format tokens as a raw representation for storing token streams.'),
+    RtfFormatter: ('RTF', ('rtf',), ('*.rtf',), 'Format tokens as RTF markup. This formatter automatically outputs full RTF documents with color information and other useful stuff. Perfect for Copy and Paste into Microsoft\xc2\xae Word\xc2\xae documents.'),
+    TerminalFormatter: ('Terminal', ('terminal', 'console'), (), 'Format tokens with ANSI color sequences, for output in a text console. Color sequences are terminated at newlines, so that paging the output works correctly.')
+}
+
+if __name__ == '__main__':
+    import sys
+    import os
+
+    # lookup formatters
+    found_formatters = []
+    imports = []
+    sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))
+    for filename in os.listdir('.'):
+        if filename.endswith('.py') and not filename.startswith('_'):
+            module_name = 'pygments.formatters.%s' % filename[:-3]
+            print module_name
+            module = __import__(module_name, None, None, [''])
+            for formatter_name in module.__all__:
+                imports.append((module_name, formatter_name))
+                formatter = getattr(module, formatter_name)
+                found_formatters.append(
+                    '%s: %r' % (formatter_name,
+                                (formatter.name,
+                                 tuple(formatter.aliases),
+                                 tuple(formatter.filenames),
+                                 docstring_headline(formatter))))
+    # sort them, that should make the diff files for svn smaller
+    found_formatters.sort()
+    imports.sort()
+
+    # extract useful sourcecode from this file
+    f = file(__file__)
+    try:
+        content = f.read()
+    finally:
+        f.close()
+    header = content[:content.find('# start')]
+    footer = content[content.find("if __name__ == '__main__':"):]
+
+    # write new file
+    f = file(__file__, 'w')
+    f.write(header)
+    f.write('# start\n')
+    f.write('\n'.join(['from %s import %s' % imp for imp in imports]))
+    f.write('\n\n')
+    f.write('FORMATTERS = {\n    %s\n}\n\n' % ',\n    '.join(found_formatters))
+    f.write(footer)
+    f.close()
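The self-regeneration step at the end of this file keeps everything before the `# start` marker and everything from the `__main__` guard onward, and rewrites only the section in between. The Python 3 sketch below applies the same slicing to a string instead of the file on disk; `regenerate`, the `pkg.*` module names, and the sample entries are made up for illustration.

```python
START_MARKER = '# start'
FOOTER_MARKER = "if __name__ == '__main__':"


def regenerate(content, imports, entries):
    """Return `content` with the section between the two markers rebuilt."""
    header = content[:content.find(START_MARKER)]
    footer = content[content.find(FOOTER_MARKER):]
    # Sorting keeps the generated section stable, so diffs stay small.
    import_lines = '\n'.join(
        'from %s import %s' % imp for imp in sorted(imports))
    mapping = ',\n    '.join(sorted(entries))
    return (header + START_MARKER + '\n' + import_lines + '\n\n'
            + 'FORMATTERS = {\n    %s\n}\n\n' % mapping + footer)


old = (
    '"""Docstring."""\n'
    '# start\n'
    'from pkg.old import OldFormatter\n\n'
    'FORMATTERS = {\n    OldFormatter: ()\n}\n\n'
    "if __name__ == '__main__':\n"
    '    pass\n'
)
new = regenerate(
    old,
    imports=[('pkg.html', 'HtmlFormatter'), ('pkg.bbcode', 'BBCodeFormatter')],
    entries=["HtmlFormatter: ('HTML',)", "BBCodeFormatter: ('BBCode',)"],
)
```

Running the function again on its own output with the same inputs returns the same text, which is the property that makes it safe to rerun the script after every formatter change.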

pygments/formatters/bbcode.py

 
 class BBCodeFormatter(Formatter):
     """
-    Formats tokens with BBcodes. These formatting codes are used by many
+    Format tokens with BBcodes. These formatting codes are used by many
     bulletin boards, so you can highlight your sourcecode with pygments before
     posting it there.
 
         If set to true, add a tag to show the code with a monospace font
         (default: ``false``).
     """
+    name = 'BBCode'
+    aliases = ['bbcode', 'bb']
+    filenames = []
 
     def __init__(self, **options):
         Formatter.__init__(self, **options)

pygments/formatters/html.py

 
 
 class HtmlFormatter(Formatter):
-    """
+    r"""
     Format tokens as HTML 4 ``<span>`` tags within a ``<pre>`` tag, wrapped
     in a ``<div>`` tag. The ``<div>``'s CSS class can be set by the `cssclass`
     option.
     and/or "full document" wrappers if the respective options are set.
     """
 
+    name = 'HTML'
+    aliases = ['html']
+    filenames = ['*.html', '*.htm']
+
     def __init__(self, **options):
         Formatter.__init__(self, **options)
         self.nowrap = get_bool_opt(options, 'nowrap', False)

pygments/formatters/latex.py

         using this prefix and some letters (default: ``'C'``).
         *New in Pygments 0.7.*
     """
+    name = 'LaTeX'
+    aliases = ['latex', 'tex']
+    filenames = ['*.tex']
 
     def __init__(self, **options):
         Formatter.__init__(self, **options)

pygments/formatters/other.py

     """
     Output the text unchanged without any formatting.
     """
+    name = 'Text only'
+    aliases = ['text', 'null']
+    filenames = ['*.txt']
+    
     def format(self, tokensource, outfile):
         enc = self.encoding
         for ttype, value in tokensource:
 
 class RawTokenFormatter(Formatter):
     r"""
-    Formats tokens as a raw representation for storing token streams.
+    Format tokens as a raw representation for storing token streams.
 
     The format is ``tokentype<TAB>repr(tokenstring)\n``. The output can later
     be converted to a token stream with the `RawTokenLexer`, described in the
         If set to ``'gz'`` or ``'bz2'``, compress the output with the given
         compression algorithm after encoding (default: ``''``).
     """
+    name = 'Raw tokens'
+    aliases = ['raw', 'tokens']
+    filenames = ['*.raw']
 
     unicodeoutput = False
 

pygments/formatters/rtf.py

 
 class RtfFormatter(Formatter):
     """
-    Formats tokens as RTF markup. This formatter automatically outputs full RTF
+    Format tokens as RTF markup. This formatter automatically outputs full RTF
     documents with color information and other useful stuff. Perfect for Copy and
     Paste into Microsoft® Word® documents.
 
         The used font family, for example ``Bitstream Vera Sans``. Defaults to
         some generic font which is supposed to have fixed width.
     """
+    name = 'RTF'
+    aliases = ['rtf']
+    filenames = ['*.rtf']
 
     unicodeoutput = False
 

pygments/formatters/terminal.py

 
 class TerminalFormatter(Formatter):
     r"""
-    Formats tokens with ANSI color sequences, for output in a text console.
+    Format tokens with ANSI color sequences, for output in a text console.
     Color sequences are terminated at newlines, so that paging the output
     works correctly.
 
         A dictionary mapping token types to (lightbg, darkbg) color names or
         ``None`` (default: ``None`` = use builtin colorscheme).
     """
+    name = 'Terminal'
+    aliases = ['terminal', 'console']
+    filenames = []
 
     def __init__(self, **options):
         Formatter.__init__(self, **options)
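
With `name`, `aliases` and `filenames` now stored on each formatter class, a lookup can be driven by the classes themselves, the same way lexers are handled. The following is a rough sketch of that idea in Python 3. The stand-in `Formatter` base class, the `FORMATTER_CLASSES` list and the two `find_*` helpers are hypothetical and do not reproduce the Pygments functions.

```python
import fnmatch


class Formatter(object):
    # Stand-in base class for this sketch only.
    name = None
    aliases = []
    filenames = []


class HtmlFormatter(Formatter):
    name = 'HTML'
    aliases = ['html']
    filenames = ['*.html', '*.htm']


class TerminalFormatter(Formatter):
    name = 'Terminal'
    aliases = ['terminal', 'console']
    filenames = []


# Hypothetical registry; the commit generates a FORMATTERS map instead.
FORMATTER_CLASSES = [HtmlFormatter, TerminalFormatter]


def find_by_alias(alias):
    """Return the first class listing `alias`, or None."""
    for cls in FORMATTER_CLASSES:
        if alias in cls.aliases:
            return cls
    return None


def find_for_filename(filename):
    """Return the first class with a pattern matching `filename`, or None."""
    for cls in FORMATTER_CLASSES:
        for pattern in cls.filenames:
            if fnmatch.fnmatch(filename, pattern):
                return cls
    return None
```

An empty `filenames` list, as on the terminal and BBCode formatters, simply means the class is never chosen from an output file name.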

pygments/lexer.py

     from sets import Set as set
 
 from pygments.filter import apply_filters, Filter
-from pygments.filters import find_filter
+from pygments.filters import get_filter_by_name
 from pygments.token import Error, Text, Other, _TokenType
 from pygments.util import get_bool_opt, get_int_opt, get_list_opt, \
      make_analysator
         Add a new stream filter to this lexer.
         """
         if not isinstance(filter, Filter):
-            filter = find_filter(filter, **options)
+            filter = get_filter_by_name(filter, **options)
         self.filters.append(filter)
 
     def analyse_text(text):
     def get_tokens(self, text, unfiltered=False):
         """
         Return an iterable of (tokentype, value) pairs generated from
-        `text`. If `unfiltered` is set to `True` the filtering mechanism
-        is bypassed, even if filters are defined.
+        `text`. If `unfiltered` is set to `True`, the filtering mechanism
+        is bypassed even if filters are defined.
 
         Also preprocess the text, i.e. expand tabs and strip it if
         wanted, and apply registered filters.

pygments/lexers/__init__.py

 
 from pygments.lexers._mapping import LEXERS
 from pygments.plugin import find_plugin_lexers
+from pygments.util import ClassNotFound
 
 
-__all__ = ['get_lexer_by_name', 'get_lexer_for_filename',
+__all__ = ['get_lexer_by_name', 'get_lexer_for_filename', 'find_lexer_class',
            'guess_lexer'] + LEXERS.keys()
 
 _lexer_cache = {}
         yield lexer.name, lexer.aliases, lexer.filenames, lexer.mimetypes
 
 
+def find_lexer_class(name):
+    """
+    Look up a lexer class by name. Return None if not found.
+    """
+    if name in _lexer_cache:
+        return _lexer_cache[name]
+    # lookup builtin lexers
+    for module_name, lname, aliases, _, _ in LEXERS.itervalues():
+        if name == lname:
+            _load_lexers(module_name)
+            return _lexer_cache[name]
+    # continue with lexers from setuptools entrypoints
+    for cls in find_plugin_lexers():
+        if cls.name == name:
+            return cls
+
+
 def get_lexer_by_name(_alias, **options):
     """
     Get a lexer by an alias.
     for cls in find_plugin_lexers():
         if _alias in cls.aliases:
             return cls(**options)
-    raise ValueError('no lexer for alias %r found' % _alias)
+    raise ClassNotFound('no lexer for alias %r found' % _alias)
 
 
 def get_lexer_for_filename(_fn, **options):
         for filename in cls.filenames:
             if fnmatch.fnmatch(fn, filename):
                 return cls(**options)
-    raise ValueError('no lexer for filename %r found' % _fn)
+    raise ClassNotFound('no lexer for filename %r found' % _fn)
 
 
 def get_lexer_for_mimetype(_mime, **options):
     for cls in find_plugin_lexers():
         if _mime in cls.mimetypes:
             return cls(**options)
-    raise ValueError('no lexer for mimetype %r found' % _mime)
+    raise ClassNotFound('no lexer for mimetype %r found' % _mime)
 
 
 def _iter_lexerclasses():
             if fnmatch.fnmatch(fn, filename):
                 matching_lexers.add(lexer)
     if not matching_lexers:
-        raise ValueError('no lexer for filename %r found' % fn)
+        raise ClassNotFound('no lexer for filename %r found' % fn)
     if len(matching_lexers) == 1:
         return matching_lexers.pop()(**options)
     result = []
         if rv > best_lexer[0]:
             best_lexer[:] = (rv, lexer)
     if not best_lexer[0] or best_lexer[1] is None:
-        raise ValueError('no lexer matching the text found')
+        raise ClassNotFound('no lexer matching the text found')
     return best_lexer[1](**options)
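
The new `find_lexer_class` checks three places in order: the cache, the builtin table (loading the module on demand), and the plugin lexers. Unlike the `get_lexer_*` functions, it returns `None` instead of raising. The following is a self-contained sketch of that lookup order in Python 3. All names in it (`BUILTIN`, `MODULES`, `_load`, `find_plugins`, `find_class`) are stand-ins invented for illustration, and the assumption that loading a module caches its classes by their `name` attribute is inferred from the diff, not confirmed by it.

```python
_cache = {}
load_calls = []  # records simulated module loads, for demonstration only


class PythonLexer(object):
    name = 'Python'


class ExtraLexer(object):
    name = 'Extra'


# Stand-in for the LEXERS table: class name -> (module name, lexer name).
BUILTIN = {'PythonLexer': ('demo.python', 'Python')}
# Stand-in for importing a module and reading the classes it exports.
MODULES = {'demo.python': [PythonLexer]}


def _load(module_name):
    """Simulate a module import that caches each class by its name."""
    load_calls.append(module_name)
    for cls in MODULES[module_name]:
        _cache[cls.name] = cls


def find_plugins():
    """Stand-in for the setuptools entry point scan."""
    yield ExtraLexer


def find_class(name):
    """Look up a class by its `name` attribute; return None if not found."""
    if name in _cache:
        return _cache[name]
    for module_name, lexer_name in BUILTIN.values():
        if name == lexer_name:
            _load(module_name)
            return _cache[name]
    for cls in find_plugins():
        if cls.name == name:
            return cls
    return None
```

Returning `None` leaves the decision about error handling to the caller, which suits callers that only want to inspect a class, such as a help option printing a docstring.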
 
 

pygments/styles/__init__.py

     :copyright: 2006-2007 by Georg Brandl.
     :license: BSD, see LICENSE for more details.
 """
+
 from pygments.plugin import find_plugin_styles
+from pygments.util import ClassNotFound
 
 
 #: Maps style names to 'submodule::classname'.
     try:
         mod = __import__('pygments.styles.' + mod, None, None, [cls])
     except ImportError:
-        raise ValueError("Could not find style module %r" % mod +
+        raise ClassNotFound("Could not find style module %r" % mod +
                          (builtin and ", though it should be builtin") + ".")
     try:
         return getattr(mod, cls)
     except AttributeError:
-        raise ValueError("Could not find style class %r in style module." % cls)
+        raise ClassNotFound("Could not find style class %r in style module." % cls)
 
 
 def get_all_styles():
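
Because `ClassNotFound` subclasses `ValueError`, callers written against the old behaviour keep working, while new callers can catch the narrower exception. The following is a small sketch of both cases in Python 3. The `STYLES` registry, `get_style` and the two caller functions are hypothetical; only the exception class mirrors the one added in this commit.

```python
class ClassNotFound(ValueError):
    """Raised when a lookup finds no matching class."""


STYLES = {'default': object}  # hypothetical registry for this sketch


def get_style(name):
    try:
        return STYLES[name]
    except KeyError:
        raise ClassNotFound('no style named %r found' % name)


def legacy_caller(name):
    # Code written before the change catches ValueError and still works.
    try:
        return get_style(name)
    except ValueError:
        return None


def updated_caller(name):
    # New code can catch the specific exception and let other
    # ValueErrors propagate.
    try:
        return get_style(name)
    except ClassNotFound:
        return None
```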

pygments/styles/autumn.py

 
 
 class AutumnStyle(Style):
+    """
+    A colorful style, inspired by the terminal highlighting style.
+    """
 
     default_style = ""
 

pygments/styles/borland.py

 
 
 class BorlandStyle(Style):
+    """
+    Style similar to the style used in the Borland IDEs.
+    """
 
     default_style = ''
 

pygments/styles/colorful.py

 
 
 class ColorfulStyle(Style):
+    """
+    A colorful style, inspired by CodeRay.
+    """
 
     default_style = ""
 

pygments/styles/friendly.py

 
 
 class FriendlyStyle(Style):
+    """
+    A modern style based on the VIM pyte theme.
+    """
 
     background_color = "#f0f0f0"
     default_style = ""

pygments/styles/fruity.py

     Generic, Number, String
 
 class FruityStyle(Style):
+    """
+    Pygments version of the "native" vim theme.
+    """
 
     background_color = '#111111'
 

pygments/styles/manni.py

 
 
 class ManniStyle(Style):
+    """
+    A colorful style, inspired by the terminal highlighting style.
+    """
 
     background_color = '#f0f3f3'
 

pygments/styles/murphy.py

 
 
 class MurphyStyle(Style):
+    """
+    Murphy's style from CodeRay.
+    """
 
     default_style = ""
 

pygments/styles/native.py

 
 
 class NativeStyle(Style):
+    """
+    Pygments version of the "native" vim theme.
+    """
 
     background_color = '#202020'
 

pygments/styles/pastie.py

 
 
 class PastieStyle(Style):
+    """
+    Style similar to the pastie default style.
+    """
 
     default_style = ''
 

pygments/styles/perldoc.py

 
 
 class PerldocStyle(Style):
+    """
+    Style similar to the style used in the perldoc code blocks.
+    """
 
     background_color = '#eeeedd'
     default_style = ''

pygments/styles/trac.py

 
 
 class TracStyle(Style):
+    """
+    Port of the default Trac highlighter design.
+    """
 
     default_style = ''
 

pygments/util.py

 tag_re = re.compile(r'<(.+?)(\s.*?)?>.*?</\1>(?uism)')
 
 
+class ClassNotFound(ValueError):
+    """
+    Raised if one of the get_*_by_* functions didn't find a matching class.
+    """
+
+
 class OptionError(Exception):
     pass
 
                           val, optname))
 
 
+def docstring_headline(obj):
+    if not obj.__doc__:
+        return ''
+    res = []
+    for line in obj.__doc__.strip().splitlines():
+        if line.strip():
+            res.append(" " + line.strip())
+        else:
+            break
+    return ''.join(res).lstrip()
+
+
 def make_analysator(f):
     """
     Return a static text analysation function that
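
The `docstring_headline` helper added above collapses the first paragraph of a docstring into a single line and stops at the first blank line. The following is a short worked example in Python 3. The function body follows the diff; the two sample classes and their docstrings are made up for illustration.

```python
def docstring_headline(obj):
    """Join the first paragraph of obj's docstring into one line."""
    if not obj.__doc__:
        return ''
    res = []
    for line in obj.__doc__.strip().splitlines():
        if line.strip():
            res.append(' ' + line.strip())
        else:
            break  # a blank line ends the first paragraph
    return ''.join(res).lstrip()


class SampleFormatter(object):
    """
    Format tokens with BBcodes. These formatting codes are used by many
    bulletin boards.

        Option descriptions would follow here.
    """


class Undocumented(object):
    pass


headline = docstring_headline(SampleFormatter)
# headline == 'Format tokens with BBcodes. These formatting codes '
#             'are used by many bulletin boards.'
```

This is the value that ends up as the last tuple element of each generated `FORMATTERS` entry.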