The PHP Lexer does not recognize all PHP identifiers

The regex matching all valid variable, function and class names is given in the PHP documentation: [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*. See for instance (it is also available on the other doc pages about variables and classes).

Currently, however, Pygments seems to match only [a-zA-Z_][a-zA-Z0-9_]* (notice the missing \x7f-\xff range in both character classes).
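To make the difference concrete, here is a minimal sketch comparing the two patterns quoted above. It is not taken from the Pygments source: the names PHP_DOC_IDENT, ASCII_IDENT and is_identifier, as well as the sample identifier "été", are made up for illustration. It also assumes the source file is UTF-8 encoded and that the \x7f-\xff range is meant to match raw bytes, which is why the patterns are applied to bytes rather than to decoded text.

```python
import re

# Pattern quoted from the PHP documentation. Compiled as a bytes pattern
# because \x7f-\xff is assumed here to describe byte values.
PHP_DOC_IDENT = re.compile(rb"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")

# ASCII-only pattern that the lexer seems to be using, per this report.
ASCII_IDENT = re.compile(rb"[a-zA-Z_][a-zA-Z0-9_]*")


def is_identifier(pattern, name):
    """Return True if the whole name matches the pattern.

    Hypothetical helper: the name is encoded as UTF-8 (an assumption)
    before matching, so non-ASCII letters become bytes >= 0x80.
    """
    return pattern.fullmatch(name.encode("utf-8")) is not None


if __name__ == "__main__":
    for name in ["foo_1", "été", "1abc"]:
        print(
            name,
            "doc pattern:", is_identifier(PHP_DOC_IDENT, name),
            "ascii pattern:", is_identifier(ASCII_IDENT, name),
        )
```

With these assumptions, a name such as "été" is accepted by the documented pattern but rejected by the ASCII-only one, which is the kind of identifier that currently loses its highlighting.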

Even though most PHP developers are not aware of this extended definition of a letter, some projects have started to use such names. See for instance Having working syntax highlighting for these files would be great (GitHub uses Pygments for this).

