load_workbook (excel.py) does not accept binary files which have "encoding" attr set to None

Issue #433 resolved
Tony Cox created an issue

We currently have a file uploader which uses Pyramid to generate a file object in memory. This file is opened in mode 'wb', but has an encoding attribute whose value is None. In previous versions of openpyxl this worked fine, but this change (lines 112-116) has broken it for us.

The check should reject a file object only when its encoding attribute is actually set; an object that has the attribute with a value of None should be allowed through.

Comments (14)

  1. CharlieC

    Sorry about that. The change was made in theory to improve the handling of such objects (see #415). You should be okay if you pin to 2.2b1 until we have a patch release. Can you submit a test case?

  2. Tony Cox reporter

    Great, thanks for the quick response. 2.2b1 will be fine for now. This is actually the first time I've posted in a forum on Bitbucket; by 'submit a test case', do you just mean to paste some code in a reply in here that clearly shows the issue? Cheers.

  3. CharlieC

    We really need a test case for this. Otherwise you'll have to fix whatever is setting an encoding to be None to create either a string or bytes object.

  4. Tony Cox reporter

    I had a quick look at what is required to submit a piece of test code to show simply how this is occurring, but it would require a large amount of work and I don't really have the time at the moment, unfortunately.

    The 'thing' which is setting encoding to be None is coming from the Pyramid package, so cannot be 'fixed' as such. We are running a webserver deep within our framework which allows file uploads. Pyramid receives a request header for the file object and creates an instance of the FieldStorage class which is effectively a file-type object stored in memory with some information regarding the http headers that were associated with the request. We have previously passed this as a file into load_workbook() with no problems: it has recognised the file, which is not a file on disk but is stored in memory, as a file-type object opened in 'wb' mode and processed it accordingly. When Pyramid creates these instances of FieldStorage, the 'file' attribute (which is an open file in memory) of the FieldStorage object has an attribute named 'encoding', which is set to None. This is what is causing load_workbook() to break when called.

    There's quite a bit of work in submitting a test case, as I would need to create a webserver on Pyramid, run some html on it with a form that accepts a file upload, and so on.
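
    A rough stand-in for the upload object may be enough to reproduce the behaviour without a Pyramid server. The sketch below assumes only what is described above: an in-memory binary file whose encoding attribute exists and is None. The class name FakeUpload is hypothetical and is not part of Pyramid or openpyxl.

```python
import io


class FakeUpload(io.BytesIO):
    """Hypothetical stand-in for the in-memory upload described above."""

    # The attribute exists on the object, but carries no value.
    encoding = None


# The leading bytes of a zip archive, used here only as sample binary data.
upload = FakeUpload(b"PK\x03\x04")

has_attr = hasattr(upload, "encoding")   # the attribute is present
value = upload.encoding                  # but its value is None
head = upload.read(2)                    # and the object still yields bytes
```

    An object like this could be passed to load_workbook() in a unit test in place of the real upload.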

  5. CharlieC

    Can you try patching openpyxl with this and let me know if the problem is resolved?

    --- a/openpyxl/reader/excel.py Fri Mar 27 14:38:54 2015 +0100
    +++ b/openpyxl/reader/excel.py Fri Mar 27 15:26:30 2015 +0100
    @@ -131,7 +131,7 @@
         if is_file_like:
             # fileobject must have been opened with 'rb' flag
             # it is required by zipfile
    -        if hasattr(filename, 'encoding'):
    +        if getattr(filename, 'encoding', None) is not None:
                 raise IOError("File-object must be opened in binary mode")
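
    To see what the one-line change does in isolation, the two conditions can be compared outside openpyxl. This is only a sketch: old_check, patched_check, UploadLike and is_rejected are hypothetical names that mirror the two versions of the condition in the patch, not functions from the library.

```python
import io


def old_check(fileobj):
    # Condition before the patch: the mere presence of the attribute is an error.
    if hasattr(fileobj, "encoding"):
        raise IOError("File-object must be opened in binary mode")


def patched_check(fileobj):
    # Condition after the patch: only a non-None encoding is an error.
    if getattr(fileobj, "encoding", None) is not None:
        raise IOError("File-object must be opened in binary mode")


class UploadLike(io.BytesIO):
    """Hypothetical binary file object whose encoding attribute is None."""

    encoding = None


def is_rejected(check, fileobj):
    """Return True if the given check raises IOError for fileobj."""
    try:
        check(fileobj)
    except IOError:
        return True
    return False


upload = UploadLike(b"PK\x03\x04")
text_file = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

print(is_rejected(old_check, upload))         # old condition rejects the upload
print(is_rejected(patched_check, upload))     # patched condition accepts it
print(is_rejected(patched_check, text_file))  # a real text wrapper is still rejected
```

    Using getattr with a default of None covers both cases in one expression: objects without the attribute and objects where it is set to None.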

  6. jamesbroadhead

    For future reference-- files opened on *nixes with 'rb' may have encoding set to None. Thanks for the fix.

  7. CharlieC

    Without a test it's just guesswork, but I'm not sure why anyone would want to pass in an open file.

  8. CharlieC

    It's not about binary mode. The change was made because of #415 related to in memory worksheets. In addition, the old code had virtually no unit tests.
