The PHP Lexer does not recognize all PHP identifiers

Issue #917 resolved
Christophe Coevoet created an issue

The PHP documentation gives the regex that matches all allowed variable names, function names and class names: [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*. See http://www.php.net/manual/en/functions.user-defined.php for instance (the same regex also appears on the doc pages about variables and classes).

Currently, however, Pygments seems to match only [a-zA-Z_][a-zA-Z0-9_]* (note the missing \x7f-\xff range in both character classes).
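The difference between the two patterns can be checked with a small Python sketch. It is only an illustration, not the Pygments lexer code: it assumes the source is UTF-8 encoded and applies both character classes to raw bytes, so that every byte of a multibyte character falls inside \x7f-\xff. The names PHP_IDENT, ASCII_IDENT and matches_fully, as well as the sample identifiers, are hypothetical.

```python
import re

# Pattern quoted from the PHP manual, applied to bytes (assumption: UTF-8 source).
PHP_IDENT = re.compile(rb'[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*')

# ASCII-only pattern that Pygments appears to use, per this report.
ASCII_IDENT = re.compile(rb'[a-zA-Z_][a-zA-Z0-9_]*')


def matches_fully(pattern, name):
    """Return True if the whole identifier, encoded as UTF-8, matches the pattern."""
    return pattern.fullmatch(name.encode('utf-8')) is not None


# Hypothetical sample identifiers: plain ASCII, non-ASCII, and an invalid one.
for name in ('_foo1', 'étoile', '1abc'):
    print(name, matches_fully(PHP_IDENT, name), matches_fully(ASCII_IDENT, name))
```

An ASCII-only name is accepted by both patterns, a name containing non-ASCII characters is accepted only by the documented PHP pattern, and a name starting with a digit is rejected by both.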

Even though most PHP developers are not aware of this extended definition of a letter, some projects have started to use such names (see for instance https://github.com/hoaproject/Core/blob/master/Core.php#L447). Working syntax highlighting for these files would be great (GitHub uses Pygments for this).
