Hey, thanks for this PR! I applied this patch to my local Pygments installation and found a bug: get_tokens_unprocessed() can forget a stray prev after exiting the loop, resulting in tokens missing from the output stream. Here is a minimal test case; with your code, the second 0 would be missing from the output:
The fix is to yield the prev if it's left over after exiting the loop:
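For readers without the diff at hand, here is a minimal sketch of the general pattern being described. It is not the PR's actual code: merge_adjacent is a hypothetical helper, and the token types are plain strings rather than Pygments token types. It only illustrates why a generator that holds one token back in prev needs a flush after its loop.

```python
def merge_adjacent(tokens):
    """Merge consecutive (type, value) tokens of the same type.

    Hypothetical helper for illustration only; one token is always
    held back in `prev` so it can be merged with the next one.
    """
    prev = None
    for ttype, value in tokens:
        if prev is not None and prev[0] == ttype:
            # Same type as the buffered token: extend it and keep buffering.
            prev = (ttype, prev[1] + value)
            continue
        if prev is not None:
            # Type changed: the buffered token is complete, so emit it.
            yield prev
        prev = (ttype, value)
    # Flush the token still held in `prev`. Without this, the final
    # token of the stream would be silently dropped.
    if prev is not None:
        yield prev


stream = [("Number", "0"), ("Text", " "), ("Number", "0")]
result = list(merge_adjacent(stream))
```

If the final two lines of the function are removed, result ends at the Text token and the second 0 goes missing, which is the same kind of symptom as the one reported above.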
I’ve updated the pull request. In the process of testing, I noticed a preexisting issue where, if there are multiple trailing newlines, all but the last one are removed. I haven’t investigated or fixed this; I suspect it’s in RegexLexer or some other more general component, and in any case it’s a different issue.
Because of that, I can’t say this change leaves the text of files completely unchanged, but this version gives the same text output as version 2.2.0 for all files in my /usr/include folder when called with pygmentize -lcpp -fnull $file. This includes libstdc++, Boost, and of course various C libraries.