Ruby: Non-ASCII Method Names Not Recognised

Issue #474 new
Anonymous created an issue

Ruby 1.9 allows method names to include non-ASCII characters with the following caveats:

  • The characters must be valid in the file's source encoding.

  • A legal method name that does not end with '!', '?', or '=' may have one of these characters appended.

  • The ASCII punctuation characters that make up operator method names (e.g. {{{[*%&^`~+-/\[<>=]}}}) must not appear in a method name in any other combination, apart from the suffix case above.

Pygments does not recognise such method names, lexing the first non-ASCII character as an error. Examples of unrecognised method names are given in .
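
To make the rules above concrete, here is a minimal Python sketch of a pattern for non-operator method names. It is not the Pygments implementation. It assumes that every non-ASCII code point is acceptable inside an identifier, and it ignores the source-encoding check and operator methods entirely. The names `IDENT_START`, `IDENT_CHAR`, `METHOD_NAME` and `is_method_name` are hypothetical.

```python
import re

# Assumption: any code point above U+007F counts as an identifier character.
# Validity in the file's source encoding is NOT checked here.
IDENT_START = r"[A-Za-z_\u0080-\U0010FFFF]"
IDENT_CHAR = r"[A-Za-z0-9_\u0080-\U0010FFFF]"

# An identifier, optionally followed by exactly one of '!', '?' or '='.
# Operator methods such as '+' or '[]=' are deliberately out of scope.
METHOD_NAME = re.compile(rf"{IDENT_START}{IDENT_CHAR}*[!?=]?")


def is_method_name(name: str) -> bool:
    """Return True if `name` looks like a non-operator Ruby method name."""
    return METHOD_NAME.fullmatch(name) is not None
```

Under these assumptions `is_method_name("größe")` and `is_method_name("café?")` are accepted, while `is_method_name("a+b")` and `is_method_name("foo!?")` are rejected. A lexer rule would need the same character classes without the full-string anchoring.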

Reported by guest

Comments (2)

  1. thatch

    I did some digging. I still can't find a formal announcement, but local rubyers confirm that such support was "rumored."

    Checking the source (ruby 1.9 snapshot, `parse.y`) I see some code for this.

    #define is_identchar(p,e,enc) (rb_enc_isalnum(*p,enc) || (*p) == '_' || !ISASCII(*p))
    #define parser_is_identchar() (!parser->eofp && is_identchar((lex_p-1),lex_pend,parser->enc))
        mb = ENC_CODERANGE_7BIT;
        do {
            if (!ISASCII(c)) mb = ENC_CODERANGE_UNKNOWN;
            if (tokadd_mbchar(c) == -1) return 0;
            c = nextc();
        } while (parser_is_identchar());
        switch (tok()[0]) {
          case '@': case '$':
            if ((c == '!' || c == '?') && !peek('=')) {
            else {

  2. thatch

    Do you have any reference to those rules, or perhaps the grammar itself? I checked the existing RubyLexer's rules and they're super-complicated:

                 bygroups(Name.Class, Operator, Name.Function), '#pop'),