new lexer request: m4

Issue #1212 new
Sonia Hamilton created an issue

I started on a very, very basic m4 lexer, but couldn't get it to work. My issue will probably get ignored, but here you go:

I find the online documentation unhelpful and out of date. I looked at:

The doco mentions a Makefile, but I couldn't find one. It mentions __ALL__, but I couldn't find it. It mentions Python, and somewhere else I found __all__.

Would you guys just write a simple "soup to nuts" guide to writing a simple Lexer? How to test it, how to set up some test data. No regexes, just plain old text matching.

from pygments.lexers.agile import PythonLexer
from pygments.token import Name, Keyword

__all__ = ['M4Lexer']

class M4Lexer(PythonLexer):
    name = 'M4'
    aliases = ['m4']
    filenames = ['*.m4']
    mimetypes = ['text/x-m4']

    EXTRA_KEYWORDS = ['divert', 'DNL', 'define']

    def get_tokens_unprocessed(self, text):
        for index, token, value in PythonLexer.get_tokens_unprocessed(self, text):
            # Re-tag the extra m4 keywords; pass every other token through unchanged.
            if token is Name and value in self.EXTRA_KEYWORDS:
                yield index, Keyword.Pseudo, value
            else:
                yield index, token, value

giving the line in

'M4Lexer': ('pygments.lexers.m4', 'M4', ('m4',), ('*.m4',), ('text/x-m4',)),
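
One way to check a lexer like this without registering it is to instantiate the class directly and inspect the tokens it yields. The following is only a sketch under some assumptions: Pygments is installed, PythonLexer is imported from pygments.lexers rather than the agile module, the sample input is made up, and the loop uses an else branch so that each token is yielded exactly once.

```python
from pygments.lexers import PythonLexer
from pygments.token import Keyword, Name


class M4Lexer(PythonLexer):
    name = 'M4'
    aliases = ['m4']
    filenames = ['*.m4']

    EXTRA_KEYWORDS = ['divert', 'DNL', 'define']

    def get_tokens_unprocessed(self, text):
        for index, token, value in PythonLexer.get_tokens_unprocessed(self, text):
            if token is Name and value in self.EXTRA_KEYWORDS:
                # Re-tag the m4 keyword instead of emitting it as a plain name.
                yield index, Keyword.Pseudo, value
            else:
                yield index, token, value


# Made-up test data: two m4 macro calls.
sample = "define(x, 1)\ndivert(0)\n"
found = list(M4Lexer().get_tokens(sample))

for token_type, value in found:
    print(token_type, repr(value))
```

Printing the token stream this way makes it easy to see which pieces of the input were re-tagged and which were left as the Python lexer classified them.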

Comments (6)

  1. Georg Brandl repo owner

    Hi Sonia, first I'm not trying to ignore anyone, it's just that time is limited and I'm basically one of two people working on this project in our spare time.

    I just had a look at the first page you linked again. It seems to be largely correct still (I found a few things to improve in f86c52317683). The Makefile is in the root directory of the checkout. You perhaps worked with an installed copy, which doesn't contain all the files needed for development (made that explicit in 51dce8c4d83d).

    There is indeed no __ALL__, but in the Pygments sources I couldn't find a reference to it. Maybe some external source misspelled it.

    As for "No regexes, just plain old text matching." I fear that 99% of Pygments lexers are using regexes, because they are (apart from a custom DSL) the most concise way to deal with this kind of text matching.

    I'm not sure how I can further help you. If you have suggestions about parts of the docs that can be changed, I'm all for it.

  2. Sonia Hamilton reporter

    Thanks, Georg, for your nice reply. I know how hard it is to maintain a public project; I have my own.

    I had a bit of a snap yesterday, as I was under stress to deliver a project. In a few days I'll have a go and work things out properly :-)
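
For readers wondering what the regex-based approach mentioned in comment 1 might look like for m4, here is a minimal sketch. It is not a lexer that ships with Pygments: the class name M4SketchLexer, its alias, and the short keyword list are hypothetical, and the quoting rule deliberately ignores m4's nested quotes.

```python
from pygments.lexer import RegexLexer, words
from pygments.token import Comment, Keyword, Name, Number, String, Text


class M4SketchLexer(RegexLexer):
    """Hypothetical minimal m4 lexer, for illustration only."""
    name = 'M4 (sketch)'
    aliases = ['m4-sketch']
    filenames = ['*.m4']

    tokens = {
        'root': [
            # dnl discards the rest of the line, including the newline.
            (r'\bdnl\b.*?\n', Comment.Single),
            (r'#.*?\n', Comment.Single),
            # A small, incomplete sample of m4 builtins.
            (words(('define', 'divert', 'ifdef', 'include'), suffix=r'\b'),
             Keyword),
            # Quoted string `like this'; nested quotes are not handled.
            (r"`[^']*'", String),
            (r'\d+', Number.Integer),
            (r'[A-Za-z_]\w*', Name),
            (r'\s+', Text),
            # Anything else (parentheses, commas) passes through as text.
            (r'.', Text),
        ],
    }


source = "define(`greeting', `hello')dnl a comment\n"
tokens = list(M4SketchLexer().get_tokens(source))

for token_type, value in tokens:
    print(token_type, repr(value))
```

Each rule pairs a regular expression with a token type, and the rules are tried in order at the current position, so the more specific patterns (comments, keywords) are listed before the generic name and catch-all rules.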
