- changed status to closed
Instantiating a freetype Font with a bytes path incorrectly decodes it
Ping @Lenard Lindstrom - this is related to but different from #196.
If pygame.freetype.Font is instantiated with a bytes path, which is common on Python 2, this line attempts to decode it with the raw_unicode_escape codec; that means that any \U subsequence will be treated as a unicode escape, and if it's not a valid one, it will get replaced with the unicode replacement character, U+FFFD.
>>> pygame.freetype.Font('C:\\Users\\...').path
u'C:\ufffdsers\\...'
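The mangling can be reproduced without pygame. What follows is a minimal sketch, assuming Python 3 and assuming the decode uses the 'replace' error handler (the report shows only the result, not the handler). The sample path is made up for illustration, and ASCII stands in for the filesystem encoding.

```python
# Hypothetical Windows-style path as bytes; "\U" begins a \UXXXXXXXX escape.
raw = b"C:\\Users\\someone\\font.ttf"

# Strict decoding rejects the incomplete escape outright.
try:
    raw.decode("raw_unicode_escape")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Assumption: with errors="replace", the invalid escape becomes U+FFFD.
mangled = raw.decode("raw_unicode_escape", "replace")

# An ordinary text codec (ASCII here, standing in for the filesystem
# encoding) does not interpret backslashes at all.
intact = raw.decode("ascii")

print(strict_failed)
print("\ufffd" in mangled)
print(intact == "C:\\Users\\someone\\font.ttf")
```

The point of the comparison is that only the escape-aware codec looks inside the path; any codec that maps bytes to characters directly leaves it untouched.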
As I mentioned on #107, this causes a test failure on Python 2 when the installation path includes \U:
======================================================================
FAIL: test_freetype_Font_path (pygame.tests.freetype_test.FreeTypeFontTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Thomas\Miniconda3\envs\pygame-py2\lib\site-packages\pygame\tests\freetype_test.py", line 1127, in test_freetype_Font_path
    self.assertEqual(self._TEST_FONTS['sans'].path, self._sans_path)
AssertionError: u'C:\ufffdsers\\Thomas\\Miniconda3\\envs\\pygame-py2\\lib\\site-packages\\pygame\\tests\\fixtures\\fonts\\test_sans.ttf' != 'C:\\Users\\Thomas\\Miniconda3\\envs\\pygame-py2\\lib\\site-packages\\pygame\\tests\\fixtures\\fonts\\test_sans.ttf'
Resolving this isn't entirely simple, because on POSIX platforms paths are passed around as bytes, and on Windows paths are (preferably) passed around as unicode strings. But I'm pretty sure that decoding with 'raw_unicode_escape' is unexpected in either case.
I'd propose that Font objects have attributes path (unicode) and pathb (bytes) on both platforms. The implementation would use path to open the file on Windows and pathb on POSIX.
On Windows, the argument should preferably be unicode:
- If passed unicode: encode with fs encoding and the 'strict' handler - if it can't be encoded, then pathb is None.
- If passed bytes: decode with fs encoding, fail if it can't be decoded (though I think that can't occur in standard locales).
On POSIX:
- If passed unicode: encode with fs encoding and the 'strict' handler, fail if it can't be encoded.
- If passed bytes: decode with fs encoding, and if it can't be decoded, then pathb is set and path is None.
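The four cases above can be sketched as a single helper. This is only an illustration of the proposal, not pygame's implementation: the function name split_font_path and its on_windows and encoding parameters are hypothetical, and Python 3 is assumed (on Python 2 the bytes check would match str).

```python
import sys


def split_font_path(arg, on_windows=None, encoding=None):
    """Return (path, pathb) for a str or bytes argument (hypothetical helper)."""
    if on_windows is None:
        on_windows = sys.platform.startswith("win")
    if encoding is None:
        encoding = sys.getfilesystemencoding()

    if isinstance(arg, bytes):
        pathb = arg
        try:
            path = arg.decode(encoding, "strict")
        except UnicodeDecodeError:
            if on_windows:
                raise  # Windows opens via path, so undecodable bytes are fatal
            path = None  # POSIX opens via pathb, so path is optional
        return path, pathb

    path = arg
    try:
        pathb = arg.encode(encoding, "strict")
    except UnicodeEncodeError:
        if not on_windows:
            raise  # POSIX opens via pathb, so unencodable text is fatal
        pathb = None  # Windows opens via path, so pathb is optional
    return path, pathb


# The encoding is passed explicitly here so the results do not depend on the host.
print(split_font_path(b"C:\\Users\\x", on_windows=True, encoding="utf-8"))
print(split_font_path(b"\xff.ttf", on_windows=False, encoding="utf-8"))
```

The asymmetry is deliberate: each platform only fails when the form it actually uses to open the file cannot be produced, and the other attribute degrades to None.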
If it's bytes, we try to decode it using the default filesystem encoding
Comments (2)
- changed version to 1.9.2
Use filesystem encoding to decode paths rather than unicode_escape codec
Closes issue #302 → <<cset 518bc4de88a6>>