Loading from a unicode object does not return unicode values.

Issue #25 new
Michel Albert
created an issue

I am backporting a Python 3 application to Python 2 and have hit a roadblock with PyYAML: when loading a unicode object in Python 2, PyYAML returns a str object instead of a unicode object.

I need to feed that output into the ipaddress module (added in Python 3.3; a backport exists), but this module only accepts unicode objects as input, which makes sense.
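
To illustrate the constraint, here is a minimal sketch using the standard-library ipaddress module on Python 3, where a byte string stands in for the Python 2 str that PyYAML hands back. The address is a made-up documentation value, and the variable names are only for illustration.

```python
import ipaddress

# Hypothetical address, used only for illustration.
text_value = u"192.0.2.1"
# Stand-in for the Python 2 str (a byte string) returned by the loader.
byte_value = text_value.encode("ascii")

# Text input is parsed as expected.
addr = ipaddress.ip_address(text_value)

# A byte string holding the same characters is rejected.
try:
    ipaddress.ip_address(byte_value)
    rejected = False
except ValueError:
    rejected = True
```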

The big issue here is that somewhere PyYAML encoded the unicode object I passed in into bytes, but it won't tell me which encoding it used. I assume UTF-8? ASCII? Maybe sys.getdefaultencoding()? I am left to guess.

I see two solutions: either make PyYAML return a unicode object when it receives a unicode object on "load", or make it return a 2-tuple containing the result and the encoding it used internally. The latter would let me re-decode the result, but the first option is still the better one in my opinion.
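
In the meantime, a stopgap on the caller's side could look like the sketch below. It is not part of PyYAML: to_text is a hypothetical helper, and the "ascii" default is an assumption (the very thing this issue cannot confirm), chosen so that a wrong guess fails loudly with a UnicodeDecodeError instead of silently producing bad text.

```python
def to_text(value, encoding="ascii"):
    """Recursively decode byte strings in a loaded structure to text.

    `encoding` is an assumption; pass a different one if needed.
    """
    if isinstance(value, bytes):
        # On Python 2, `bytes` is an alias for `str`.
        return value.decode(encoding)
    if isinstance(value, dict):
        return {
            to_text(key, encoding): to_text(item, encoding)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [to_text(item, encoding) for item in value]
    # Numbers, booleans, None and text values pass through unchanged.
    return value


# Hypothetical loaded document with byte-string keys and values.
loaded = {b"host": b"192.0.2.1", b"ports": [80, b"443"]}
converted = to_text(loaded)
```

The converted structure can then be passed to ipaddress, but this only works as long as the guessed encoding matches what was used internally.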
