Issue #917 resolved

The PHP Lexer does not recognize all PHP identifiers

Christophe Coevoet
created an issue

The regex matching all allowed variable, function and class names is given in the PHP documentation: [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*. See http://www.php.net/manual/en/functions.user-defined.php for instance (the same pattern also appears on the doc pages about variables and classes).

Currently, however, Pygments seems to match only [a-zA-Z_][a-zA-Z0-9_]* (notice the missing \x7f-\xff range in both character classes).
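
To illustrate the difference, here is a minimal Python sketch comparing the two patterns. It is not Pygments' actual lexer code: the names PHP_IDENT, ASCII_IDENT and is_identifier are made up for this example. It matches on UTF-8 bytes, on the assumption that PHP treats identifiers as byte strings.

```python
import re

# Pattern from the PHP manual, applied to raw bytes
# (assumption: the source file is UTF-8 encoded).
PHP_IDENT = re.compile(rb"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")

# ASCII-only pattern, as the lexer appears to use today.
ASCII_IDENT = re.compile(rb"[a-zA-Z_][a-zA-Z0-9_]*")


def is_identifier(pattern, name):
    """Return True if the whole name matches the given bytes pattern."""
    return pattern.fullmatch(name.encode("utf-8")) is not None


samples = ["foo", "_bar9", "été", "9lives"]

# Map each name to (matches PHP pattern, matches ASCII-only pattern).
results = {
    name: (is_identifier(PHP_IDENT, name), is_identifier(ASCII_IDENT, name))
    for name in samples
}

for name, (php_ok, ascii_ok) in results.items():
    print(f"{name!r}: php={php_ok} ascii_only={ascii_ok}")
```

A name such as "été" is accepted by the documented pattern but rejected by the ASCII-only one, which is the case this issue is about.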

Even though most PHP developers are not aware of this extended definition of what a letter is, some projects have started to use such names. See for instance https://github.com/hoaproject/Core/blob/master/Core.php#L447. Working syntax highlighting for these files would be great (GitHub uses Pygments for this).
