crash in the case of wrong encoding

Issue #45 resolved
Former user created an issue

I do not think that pybtex should crash when it reads a file in a wrong encoding.

  File "/tmp/build/lib.linux-i686-2.6/pybtex/cmdline.py", line 120, in main
    self.run(options, args)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/__main__.py", line 158, in run
    engine.make_bibliography(filename, **kwargs)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/bibtex/__init__.py", line 53, in make_bibliography
    interpreter.run(bst_script, aux_data.citations, bib_filenames, bbl_file, min_crossrefs=min_crossrefs)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/bibtex/interpreter.py", line 216, in run
    getattr(self, method)(*args)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/bibtex/interpreter.py", line 267, in command_read
    self.bib_data = p.parse_files(self.bib_files)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/database/input/__init__.py", line 58, in parse_files
    self.parse_file(filename, file_suffix)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/database/input/__init__.py", line 50, in parse_file
    self.parse_stream(f)
  File "/tmp/build/lib.linux-i686-2.6/pybtex/database/input/bibtex.py", line 333, in parse_stream
    text = stream.read()
  File "/usr/lib/python2.6/io.py", line 1767, in read
    decoder.decode(self.buffer.read(), final=True))
  File "/usr/lib/python2.6/io.py", line 1319, in decode
    output = self.decoder.decode(input, final=final)
  File "/usr/lib/python2.6/codecs.py", line 296, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcf in position 35: invalid continuation byte

With the quick fix provided below, it at least reports the file name:

WARNING: In db-cp1251.bib: 'utf8' codec can't decode byte 0xcf in position 35: invalid continuation byte.
Warning--missing database entry for "entry-cp1251"

Actually, I would prefer to have the line number as well in the warning.

modified file 'pybtex/database/input/__init__.py'

--- pybtex/database/input/__init__.py 2011-02-11 15:27:53 +0000
+++ pybtex/database/input/__init__.py 2011-12-25 03:10:13 +0000
@@ -26,6 +26,8 @@
 import pybtex.io
 from pybtex.plugin import Plugin
 from pybtex.database import BibliographyData
+from pybtex.errors import report_error
+from pybtex.exceptions import PybtexError
 
 
 class BaseParser(Plugin):
@@ -44,7 +46,11 @@
         self.filename = filename
         open_file = pybtex.io.open_unicode if self.unicode_io else pybtex.io.open_raw
         with open_file(filename, encoding=self.encoding) as f:
-            self.parse_stream(f)
+            try:
+                self.parse_stream(f)
+            except UnicodeDecodeError as e:
+                report_error(PybtexError(u'in {0}: {1}'.format(filename, e)))
+
         return self.data
 
     def parse_files(self, base_filenames, file_suffix=None):
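
As for the line number: a rough sketch of how it could be derived from the exception itself, assuming the whole file is decoded in one call (as `stream.read()` does in the traceback above). `UnicodeDecodeError` carries the byte offset of the failure in its `start` attribute, so counting newlines before that offset gives the line. The helper name `locate_decode_error` and the sample data are made up for illustration and are not part of pybtex.

```python
def locate_decode_error(data, encoding="utf-8"):
    """Return None if data decodes cleanly, else (line, column, reason).

    Hypothetical helper, not part of pybtex. Line and column are 1-based;
    the column is counted in bytes, not characters.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as e:
        # e.start is the byte offset of the first undecodable byte.
        line = data.count(b"\n", 0, e.start) + 1
        line_start = data.rfind(b"\n", 0, e.start) + 1
        column = e.start - line_start + 1
        return line, column, e.reason
    return None


# Made-up sample: a BibTeX entry with a Cyrillic author name, stored as
# cp1251. Those bytes are not valid UTF-8.
author = u"\u041f\u0440\u0438\u043c\u0435\u0440"
sample = (u"@article{entry-cp1251,\n  author = {" + author + u"},\n}\n").encode("cp1251")

problem = locate_decode_error(sample)
if problem is not None:
    print("db-cp1251.bib:{0}:{1}: {2}".format(*problem))
```

This only works when the bytes handed to the decoder are the whole file; with chunked or incremental decoding, `e.start` is relative to the current chunk and would need to be offset by the bytes already consumed.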
