Scala lexer incorrectly highlights code containing type parameters

Issue #1265 new
Quentin Stievenart
created an issue

See the following fragment: http://pygments.org/demo/5862224/ (repeated below for completeness). Code appearing after a trait definition with a type parameter is incorrectly highlighted.

// Incorrect highlighting (on `case`)
trait Foo[A]
case class Foo1(x: A) extends Foo
// Correct highlighting
case class Foo2(x: A) extends Foo

// Incorrect highlighting (on `extends`)
trait Bar[A] extends Foo[A]
// Correct highlighting
trait Bar extends Foo

Comments (3)

  1. Quentin Stievenart reporter

    The source of the problem is the following. When it encounters a class/trait definition, Pygments enters the 'class' state of ScalaLexer. If no type parameter is given, the last rule matches and Pygments does a '#pop' on the current state, going back to the 'root' state. It can then successfully lex the rest of the code. However, if a type parameter is given, it enters the 'typeparam' state, which consumes the type parameter up to the ']' and then comes back to the 'class' state. If no braces follow, as in the example above, the 'class' state still expects the name of the class to appear, and that is why 'case' and 'extends' in the example above are incorrectly highlighted: they are lexed as Name.Class instead of Keyword. To confirm this, note that the 'class' state has a rule for single-line comments (added to fix issue #713), but none for multiline comments. What happens when a multiline comment follows a type parameter?

    trait Foo[A] /* comment comment comment */
    

    As expected, it is not highlighted correctly. I don't know how to fix this cleanly. I guess that 'typeparam' should pop twice instead of just once. The quick and dirty solution is to add rules from 'root' to 'class', as was done for single-line comments, but that's not clean.

  2. David Corbett

    How about this?

    diff -r e79a7126551c pygments/lexers/jvm.py
    --- a/pygments/lexers/jvm.py    Thu Jun 16 19:10:37 2016 +0200
    +++ b/pygments/lexers/jvm.py    Thu Aug 04 09:41:24 2016 -0400
    @@ -264,8 +264,7 @@
                 # method names
                 (r'(class|trait|object)(\s+)', bygroups(Keyword, Text), 'class'),
                 (r'[^\S\n]+', Text),
    -            (r'//.*?\n', Comment.Single),
    -            (r'/\*', Comment.Multiline, 'comment'),
    +            include('comments'),
                 (u'@%s' % idrest, Name.Decorator),
                 (u'(abstract|ca(?:se|tch)|d(?:ef|o)|e(?:lse|xtends)|'
                  u'f(?:inal(?:ly)?|or(?:Some)?)|i(?:f|mplicit)|'
    @@ -300,15 +299,16 @@
             ],
             'class': [
                 (u'(%s|%s|`[^`]+`)(\\s*)(\\[)' % (idrest, op),
    -             bygroups(Name.Class, Text, Operator), 'typeparam'),
    +             bygroups(Name.Class, Text, Operator), ('#pop', 'typeparam')),
                 (r'\s+', Text),
    +            include('comments'),
                 (r'\{', Operator, '#pop'),
                 (r'\(', Operator, '#pop'),
    -            (r'//.*?\n', Comment.Single, '#pop'),
                 (u'%s|%s|`[^`]+`' % (idrest, op), Name.Class, '#pop'),
             ],
             'type': [
                 (r'\s+', Text),
    +            include('comments'),
                 (r'<[%:]|>:|[#_]|forSome|type', Keyword),
                 (u'([,);}]|=>|=|\u21d2)(\\s*)', bygroups(Operator, Text), '#pop'),
                 (r'[({]', Operator, '#push'),
    @@ -318,16 +318,21 @@
                 (u'((?:%s|%s|`[^`]+`)(?:\\.(?:%s|%s|`[^`]+`))*)(\\s*)$' %
                  (idrest, op, idrest, op),
                  bygroups(Keyword.Type, Text), '#pop'),
    -            (r'//.*?\n', Comment.Single, '#pop'),
                 (u'\\.|%s|%s|`[^`]+`' % (idrest, op), Keyword.Type)
             ],
             'typeparam': [
    -            (r'[\s,]+', Text),
    +            (r'\s+', Text),
    +            include('comments'),
    +            (r',+', Punctuation),
                 (u'<[%:]|=>|>:|[#_\u21D2]|forSome|type', Keyword),
                 (r'([\])}])', Operator, '#pop'),
                 (r'[(\[{]', Operator, '#push'),
                 (u'\\.|%s|%s|`[^`]+`' % (idrest, op), Keyword.Type)
             ],
    +        'comments': [
    +            (r'//.*?\n', Comment.Single),
    +            (r'/\*', Comment.Multiline, 'comment'),
    +        ],
             'comment': [
                 (r'[^/*]+', Comment.Multiline),
                 (r'/\*', Comment.Multiline, '#push'),
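
The state-stack behaviour discussed in this thread can be reproduced with a small standalone sketch. This is a toy lexer, not Pygments itself: the rule tables are heavily simplified, and the names `make_rules`, `apply_transition`, `lex` and `label_of` are hypothetical helpers introduced only for illustration. It assumes that a tuple transition such as `('#pop', 'typeparam')` is applied left to right, which is how the patch above relies on it: the 'class' state is popped before 'typeparam' is pushed, so the closing `]` returns to 'root'.

```python
import re


def make_rules(bracket_transition):
    """Build toy rule tables: state -> [(regex, label, transition)].

    bracket_transition is what happens when '[' follows the class name:
    'typeparam' models the old behaviour, ('#pop', 'typeparam') the patch.
    """
    return {
        'root': [
            (r'(?:class|trait|object)\b', 'Keyword', 'class'),
            (r'(?:case|extends)\b', 'Keyword', None),
            (r'\s+', 'Text', None),
            (r'\w+', 'Name', None),
            (r'[()\[\]:]', 'Operator', None),
        ],
        'class': [
            (r'\w+(?=\[)', 'Name.Class', None),   # name followed by '['
            (r'\[', 'Operator', bracket_transition),
            (r'\s+', 'Text', None),
            (r'[{(]', 'Operator', '#pop'),
            (r'\w+', 'Name.Class', '#pop'),       # plain class name
        ],
        'typeparam': [
            (r'\s+', 'Text', None),
            (r'\]', 'Operator', '#pop'),
            (r'\w+', 'Keyword.Type', None),
        ],
    }


def apply_transition(stack, transition):
    """Apply a single transition or a tuple of transitions, in order."""
    if transition is None:
        return
    steps = transition if isinstance(transition, tuple) else (transition,)
    for step in steps:
        if step == '#pop':
            stack.pop()
        else:
            stack.append(step)


def lex(text, rules):
    """Return (label, value) pairs using the first matching rule per state."""
    stack, pos, tokens = ['root'], 0, []
    while pos < len(text):
        for pattern, label, transition in rules[stack[-1]]:
            match = re.compile(pattern).match(text, pos)
            if match:
                tokens.append((label, match.group()))
                pos = match.end()
                apply_transition(stack, transition)
                break
        else:
            # No rule matched: emit one error character and move on.
            tokens.append(('Error', text[pos]))
            pos += 1
    return tokens


def label_of(tokens, word):
    """Label assigned to the first token whose text equals word."""
    return next(label for label, value in tokens if value == word)


SOURCE = "trait Foo[A]\ncase class Foo1(x: A) extends Foo"
old_tokens = lex(SOURCE, make_rules('typeparam'))
new_tokens = lex(SOURCE, make_rules(('#pop', 'typeparam')))

print(label_of(old_tokens, 'case'))  # old rules: still in 'class' after ']'
print(label_of(new_tokens, 'case'))  # patched rules: back in 'root' after ']'
```

With the old transition the stack is `['root', 'class']` after the closing `]`, so the next word is consumed by the class-name rule. With the tuple transition the stack is `['root']` at that point, and `case` is matched by the keyword rule instead.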
    