Update C and C++ lexers to support more literal formats

#723 Open
Repository
pygments
Branch
cpp_literals
Repository
pygments-main
Branch
default

The commits that make up this pull request have been removed.

Bitbucket cannot automatically merge this request due to conflicts.

Review the conflicts on the Overview tab. You can then either decline the request or merge it manually on your local system using the following commands:

hg update default
hg pull -r cpp_literals https://bitbucket.org/drhouck/pygments
hg merge cpp_literals
hg commit -m 'Merged in drhouck/pygments/cpp_literals (pull request #723)'
Author
  1. Daniel Houck
Reviewers
Description

This fixes #1121 and adds support for user-defined literals and hexadecimal floating-point literals.

  • Issues #1121: C++ literal errors new

Comments (4)

  1. Vladimír Vondruš

    Hey, thanks for this PR! I applied this patch to my local Pygments installation and found a bug: get_tokens_unprocessed() can forget a stray prev after exiting the loop, resulting in tokens missing from the output stream. Here is a minimal test case; with your code, the second 0 would be missing from the output:

    struct Foo {
        virtual int a() = 0;
        virtual int b() = 0;
    };
    

    The fix is to yield the prev if it's left over after exiting the loop:

    diff --git a/pygments/lexers/c_cpp.py b/pygments/lexers/c_cpp.py
    index dbf50959..c6ad12f0 100644
    --- a/pygments/lexers/c_cpp.py
    +++ b/pygments/lexers/c_cpp.py
    @@ -240,6 +240,9 @@ class CppLexer(CFamilyLexer):
                 else:
                     yield index, token, value
    
    +        if prev:
    +            yield prev["index"], prev["token"], prev["value"]
    +
         tokens = {
             'statements': [
                 (words((
    

    As an aside, I'm maintaining a temporary patched version of Pygments containing this, https://bitbucket.org/birkenfeld/pygments-main/pull-requests/740 and possibly more over at GitHub, until the next version is released: https://github.com/mosra/pygments
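
To illustrate the leftover-prev problem fixed by the diff above, here is a minimal sketch of a token filter that holds one token back. It is not the code from this pull request: the function name `merge_suffixes`, the plain-string token types, and the merging rule are hypothetical stand-ins chosen only to show why the buffered token has to be flushed after the loop.

```python
def merge_suffixes(tokens):
    """Hypothetical filter over (index, token, value) triples.

    A "Number" token is held back in `prev` so that a directly following
    "Name" token (standing in for a literal suffix) can be merged into it.
    """
    prev = None
    for index, token, value in tokens:
        if prev is not None:
            if token == "Name":
                # Merge the suffix into the buffered number.
                yield prev["index"], "Number", prev["value"] + value
                prev = None
                continue
            # No suffix followed: emit the buffered number unchanged.
            yield prev["index"], prev["token"], prev["value"]
            prev = None
        if token == "Number":
            # Hold the number back until the next token is known.
            prev = {"index": index, "token": token, "value": value}
        else:
            yield index, token, value

    # Without this flush, a number that is the last token of the stream
    # would never be yielded.
    if prev is not None:
        yield prev["index"], prev["token"], prev["value"]


stream = [(0, "Number", "0"), (1, "Punctuation", ";"), (2, "Number", "0")]
result = list(merge_suffixes(stream))
```

In this sketch the final `Number` token is still buffered when the loop ends, so it appears in `result` only because of the trailing flush.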

  2. Daniel Houck author

    I’ve updated the pull request. In the process of testing, I noticed a preexisting issue where, if there are multiple trailing newlines, all but the last one are removed. I haven’t investigated or fixed this; I suspect it’s in RegexLexer or some other more general component, and in any case it’s a different issue.

    Because of that, I can’t say that this change leaves the text of files completely unchanged, but this version gives the same text output as version 2.2.0 for all files in my /usr/include folder when called with pygmentize -lcpp -fnull $file. This includes libstdc++, Boost, and of course various C libraries.
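
The text comparison described above can be sketched as a round-trip check: concatenate the token values and compare them with the input, ignoring trailing newlines. The trailing-newline behaviour may come from Pygments' documented `stripnl` and `ensurenl` lexer options (both on by default), though that is not confirmed in this thread. The helper names and the toy tokenizer below are hypothetical; a real check would feed in the lexer's `get_tokens_unprocessed()` output instead.

```python
import re


def joined_text(tokens):
    """Concatenate the value field of (index, token, value) triples."""
    return "".join(value for _index, _token, value in tokens)


def same_text(source, tokens):
    """Return True if the tokens reproduce `source`, ignoring trailing newlines."""
    return joined_text(tokens).rstrip("\n") == source.rstrip("\n")


def toy_tokens(text):
    """Hypothetical stand-in tokenizer: whitespace runs, word runs, single characters."""
    for match in re.finditer(r"\s+|\w+|.", text):
        yield match.start(), "Text", match.group()


source = "virtual int a() = 0;\n\n\n"
# Simulate a lexer that saw the input with only one trailing newline.
ok = same_text(source, toy_tokens("virtual int a() = 0;\n"))
# Simulate a lexer that dropped the final "0;" tokens.
dropped = same_text(source, toy_tokens("virtual int a() = "))
```

Here `ok` is true and `dropped` is false, so a check of this shape would flag missing tokens while tolerating the trailing-newline difference.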