String type is returned in Python 2.7 when content is a newline

Issue #1349 new
Anonymous created an issue

I've reported this issue in the Trac Project with Pygments 2.2. It can be reproduced in Python 2.7 with:

import io

from pygments.formatters.html import HtmlFormatter
from pygments.lexers import get_lexer_by_name


lexer_options = {'stripnl': False}
lexer_name = 'ipython2'

content = """
"""

out = io.StringIO()
lexer = get_lexer_by_name(lexer_name, **lexer_options)
formatter = HtmlFormatter(nowrap=True)
formatter.format(lexer.get_tokens(content), out)

assert '\n' == out.getvalue()

An error results: TypeError: unicode argument expected, got 'str'

The exception does not occur if a single whitespace character is added before the newline. In html.py, the code takes the if line branch when there is a single whitespace and a newline, but takes the else: yield 1, lsep branch when there is only a newline. The returned newline (lsep) seems to be a byte string rather than a unicode string (str type rather than unicode type in Python 2.7). I haven't tested with Python 3.
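
To illustrate the type mismatch independently of Pygments, here is a minimal sketch of how io.StringIO treats text versus byte strings. The helper name try_write is hypothetical and not part of Pygments or the original report. On Python 2.7 a bare '\n' literal is a byte string (str), which is the same type the default lineseparator appears to have, so it is rejected in the same way as b'\n' below.

```python
import io


def try_write(value):
    """Write value to a fresh io.StringIO.

    Returns None on success, or the TypeError message if the
    buffer rejects the value. (Hypothetical helper for illustration.)
    """
    buf = io.StringIO()
    try:
        buf.write(value)
    except TypeError as exc:
        return str(exc)
    return None


# A unicode (text) newline is accepted by io.StringIO.
text_error = try_write(u'\n')

# A byte-string newline is rejected with a TypeError.
# On Python 2.7, b'\n' and a plain '\n' literal are the same str type.
bytes_error = try_write(b'\n')

print(text_error)
print(bytes_error)
```

This would also be consistent with the whitespace case passing: on Python 2.7, joining a unicode fragment with a str separator produces a unicode result, so the byte string would only surface when lsep is yielded on its own. That reading is an assumption and was not verified against html.py.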

A successful workaround is to specify lineseparator as a unicode string.

- formatter = HtmlFormatter(nowrap=True)
+ formatter = HtmlFormatter(nowrap=True, lineseparator=u'\n')
