BashLexer misses keywords

Issue #844 resolved
Daniel Pek
created an issue

When I try to tokenize a script like this:

echo something;if [ "$a" = "b" ];then echo b;fi

the result is:

(Token.Name.Builtin, u'echo ')
(Token.Text, u'something;if')
(Token.Text, u' ')
(Token.Operator, u'[')
(Token.Text, u' ')
(Token.Literal.String.Double, u'"$a"')
(Token.Text, u' ')
(Token.Operator, u'=')
(Token.Text, u' ')
(Token.Literal.String.Double, u'"b"')
(Token.Text, u' ')
(Token.Operator, u']')
(Token.Text, u';')
(Token.Keyword, u'then ')
(Token.Name.Builtin, u'echo ')
(Token.Text, u'b;fi')
(Token.Text, u'\n')

So sometimes tokens aren't separated at semicolons, which in my opinion they should be.
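One plausible way for output like this to arise is sketched below with plain `re` rather than Pygments itself. A regex-based lexer tries its rules only at the current scan position, so if a catch-all text rule accepts `;`, it can consume `something;if` in a single match and the keyword rule never gets a turn at `if`. The patterns `GREEDY_TEXT` and `KEYWORD` are hypothetical stand-ins for illustration, not the actual BashLexer rules.

```python
import re

# Hypothetical catch-all rule: a run of characters that are not
# whitespace, quotes, or brackets. Note that ';' is NOT excluded.
GREEDY_TEXT = re.compile(r'[^\s\[\]"\']+')

# Hypothetical keyword rule, anchored on word boundaries.
KEYWORD = re.compile(r'\b(?:if|then|fi)\b')

script = 'echo something;if [ "$a" = "b" ];then echo b;fi'

# Start scanning right after "echo ".
pos = len('echo ')

# The keyword rule is tried at the current position and fails there,
# because the text at that position starts with "something", not "if".
keyword_here = KEYWORD.match(script, pos)

# The catch-all rule then runs and swallows the semicolon and the keyword.
swallowed = GREEDY_TEXT.match(script, pos).group()

print(keyword_here)  # None
print(swallowed)     # something;if
```

Under these assumptions the keyword is lost because of where the scan position lands, even though `if` is in the keyword list.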

Comments (7)

  1. Georg Brandl repo owner

    Hi Daniel,

    Pygments is not a generic lexer/tokenizer but a highlighter. For that purpose, adjacent tokens of the same type are concatenated where possible so that no redundant output is produced.

    However, it looks like the "if/fi" tokens are not highlighted correctly as keywords, which should be fixed.

  2. Daniel Pek reporter

    Hi Georg,

    I think you can't highlight the "if" keyword unless you tokenize it correctly. In some cases the "if" keyword does get highlighted correctly, so I think the real reason is not that it is missing from the keyword list, but that it isn't tokenized properly. I'm not sure, though; I may be wrong.


    echo something;if [ "$a" = "b" ]
    echo something; if [ "$a" = "b" ]

    As you can see, in the second case highlighting works fine.

  3. Daniel Pek reporter

    Actually, ; is the synchronous separator and & is the asynchronous one, and the latter isn't handled well either:

    echo something&if [ "$a" = "b" ]
    echo something & if [ "$a" = "b" ]
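The behaviour discussed in this thread can be sketched with a toy rule-based lexer. This is a minimal illustration under stated assumptions, not Pygments' real BashLexer: the rule table, the `make_rules`, `lex` and `keywords` helpers, and the excluded-character sets are all hypothetical. It merges adjacent tokens of the same type, as described in the first comment, and shows that excluding `;` and `&` from the catch-all text rule is enough for `if` to be seen as a keyword in the unspaced examples.

```python
import re
from itertools import groupby

KEYWORDS = r'\b(?:if|then|else|elif|fi)\b'

def make_rules(text_excludes):
    """Build an ordered rule list; the first rule matching at the scan position wins."""
    return [
        ('Keyword', re.compile(KEYWORDS)),
        ('String', re.compile(r'"[^"]*"')),
        ('Operator', re.compile(r'[\[\]=]')),
        ('Text', re.compile(r'\s+')),
        # Catch-all: a run of characters outside the excluded set.
        ('Text', re.compile('[^' + text_excludes + ']+')),
        # Fallback so the scanner always advances by one character.
        ('Text', re.compile(r'.', re.DOTALL)),
    ]

def lex(script, rules):
    """Scan left to right, then merge neighbouring tokens of the same type."""
    tokens, pos = [], 0
    while pos < len(script):
        for kind, pattern in rules:
            m = pattern.match(script, pos)
            if m:
                tokens.append((kind, m.group()))
                pos = m.end()
                break
    return [(kind, ''.join(value for _, value in group))
            for kind, group in groupby(tokens, key=lambda t: t[0])]

def keywords(tokens):
    """Return the text of every Keyword token."""
    return [value for kind, value in tokens if kind == 'Keyword']

# Catch-all that lets ';' and '&' through (reproduces the symptom).
BROKEN = make_rules(r'\s\[\]"')
# Catch-all that stops at the separators (one possible remedy).
FIXED = make_rules(r'\s\[\]";&')

print(keywords(lex('echo something;if [ "$a" = "b" ]', BROKEN)))   # []
print(keywords(lex('echo something; if [ "$a" = "b" ]', BROKEN)))  # ['if']
print(keywords(lex('echo something;if [ "$a" = "b" ]', FIXED)))    # ['if']
print(keywords(lex('echo something&if [ "$a" = "b" ]', FIXED)))    # ['if']
```

Note that merging same-type tokens does not hide the keyword once it is matched: with the `FIXED` rules, `echo something;` collapses into a single Text token and `if` still comes out as its own Keyword token.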