Issue #710 open

new lexer request: mathematica (with patch)

Anonymous created an issue

Support for highlighting of Mathematica code would be nice.

Comments (13)

  1. Thomas Aglassinger

    I recently wrote a couple of lexers that are important to me, so if there is not much effort involved, I volunteer to implement a basic Mathematica lexer. I have never used Mathematica, but a little research revealed the following links that describe the syntax:

    The easiest way to convert this information to a lexer would be text files using UTF-8 encoding where each line contains exactly one word for each keyword/operator/function/constant. If there is more than one way to represent the same thing (e.g. \[Pi] and π), there has to be a separate line for each. The actual number of files depends on how useful it seems to have different colors in the lexer. From my point of view, constants.txt, functions.txt and operators.txt seem to be sufficient, but maybe there also is a need for keywords.txt.

    Next, there's a need for an example.m file. Preferably a short one that exercises a lot of the syntax.

    I noticed that Mathematica uses the same file extension as Matlab (.m), so in order for pygments to guess the file type properly there is a need for some heuristic. Are there any syntax constructs that can be found in every Mathematica file and in no Matlab file?

    If anyone's interested in this, build and attach the files described above to this issue and I will turn them into a lexer.

  2. Nikolaus Sonnenschein

    I am not quite sure (it's been a while) but I believe I was the original poster. You're talking about a pygments lexer, right? I can provide the necessary files soon.

    Even a very basic implementation would be very helpful. I don't know anything about pygments but I am a decent Python programmer and would probably be able to get into it. Thank you so much for offering your help.

  3. Thomas Aglassinger

    I've forked pygments and implemented a basic lexer. To obtain it, run:

    hg clone --branch mathematica https://roskakori@bitbucket.org/roskakori/pygments-main
    

    To render the example run:

    ./pygmentize -l mathematica -f html -O full,style=emacs -o /tmp/example.html  tests/examplefiles/mathematica.m
    

    and open the resulting /tmp/example.html in your browser.

    This will probably issue a warning like: .../pygments-main/pygments/plugin.py:39: UserWarning: Module pygments was already imported from .../pygments-main/pygments/__init__.pyc, but /home/.../pygments-main is being added to sys.path.

    To get rid of this, install the modified pygments instead of using it from the current path, e.g. by running:

    sudo python setup.py develop
    

    The lexer should currently render comments, numbers, \[named] constants and strings properly. It also detects several operators, though there are probably some missing, and it knows about a few built-in function names.

    To improve the lexer, download keywords_mathematica.zip, expand it, add additional operators, functions and constants to the respective text file, attach it to this issue under a new name and add a comment on what you improved.

    Alternatively you can modify the source and issue a pull request on my fork. See MathematicaLexer in pygments/lexers/math.py.

    The zip also includes a little script, listify.py, that I used to convert the text files to Python code. Just pass the text file to convert as a parameter, e.g.:

    python listify.py operators.txt
    

    Then copy the console output to math.py at the appropriate place.
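
    The script itself is not reproduced in this thread. As a rough sketch of what such a converter could look like, assuming one word per line in a UTF-8 text file as described earlier (the function names, the variable name derived from the file name, and the output layout are all assumptions, not the actual contents of listify.py):

```python
import io
import os
import sys


def read_words(lines):
    """Return the stripped, non-empty lines in order, skipping duplicates."""
    seen = set()
    words = []
    for line in lines:
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


def to_python_source(name, words):
    """Format the words as a Python list assignment, one item per line."""
    body = "".join("    %r,\n" % word for word in words)
    return "%s = [\n%s]\n" % (name, body)


# Hypothetical command line use: python listify_sketch.py operators.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    # operators.txt -> OPERATORS (naming scheme is an assumption)
    name = os.path.splitext(os.path.basename(path))[0].upper()
    with io.open(path, encoding="utf-8") as word_file:
        sys.stdout.write(to_python_source(name, read_words(word_file)))
```

    The printed list can then be pasted into the lexer source. Using repr-style quoting keeps entries such as \[Pi] valid as Python string literals.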

    The lexer currently detects *.nb and *.nbp files. For *.m files to be recognized as Mathematica, they have to include a (* comment *). Suggestions on how to improve this heuristic are welcome.
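
    For illustration, the comment check could be written as a standalone scoring function along these lines. This is only a sketch, not the code in the fork: the function name is hypothetical, the 0.11 weight is taken from the certainty mentioned in this comment, and the extra hint for \[Name] characters (with its 0.1 weight) is an assumption about one possible improvement. In Pygments, logic like this would normally live in the lexer's analyse_text method, which returns a value between 0.0 and 1.0.

```python
import re

# A Mathematica comment: (* ... *), possibly spanning several lines.
COMMENT_RE = re.compile(r"\(\*.*?\*\)", re.DOTALL)

# A named character such as \[Pi] or \[Alpha] (assumed extra hint).
NAMED_CHAR_RE = re.compile(r"\\\[[A-Z][A-Za-z]*\]")


def mathematica_score(text):
    """Return a rough certainty between 0.0 and 1.0 that text is Mathematica."""
    score = 0.0
    if COMMENT_RE.search(text):
        score += 0.11  # weight taken from the certainty quoted in this thread
    if NAMED_CHAR_RE.search(text):
        score += 0.1   # assumed weight for the additional hint
    return min(score, 1.0)
```

    A file with only Matlab-style % comments scores 0.0 here, while any (* comment *) is enough to reach 0.11. The named character hint is weak on its own, since an expression like A\[B] could also occur in Matlab code.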

    The unit test suite (make test) currently fails because the ObjectiveCLexer tries to process tests/examplefiles/mathematica.m. I've looked into this but I don't know why it does that. To my understanding, mathematica.m should be recognized as Objective C with a certainty of 0.10 and as Mathematica with 0.11, thus Mathematica should win. However, it does not. Here is the stack trace you can expect:

    Traceback (most recent call last):
      File ".../nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File ".../pygments-main/tests/test_examplefiles.py", line 71, in check_lexer
        (lx, absfn, val, len(u''.join(ntext)))
    AssertionError: lexer <pygments.lexers.ObjectiveCLexer> generated error token for .../pygments-main/tests/examplefiles/mathematica.m: u'`' at position 191
    