unable to read from workbook with [invalid] external link

Issue #575 resolved
Kirill Marchuk created an issue

Hi all

Windows 10, openpyxl 2.3.2

I've got an Excel .xlsx file from our VoIP provider, and I cannot parse it using pyexcel-xlsx due to this exception:

File "C:\Python27\lib\site-packages\openpyxl\workbook\names\external.py", line 83, in parse_ranges
    names = book.find('{%s}definedNames' % SHEET_MAIN_NS)
AttributeError: 'NoneType' object has no attribute 'find'

This is a place where it happens:

def parse_ranges(xml):
    tree = fromstring(xml)
    book = tree.find('{%s}externalBook' % SHEET_MAIN_NS)
    names = book.find('{%s}definedNames' % SHEET_MAIN_NS)

the "xml" parameter is as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<externalLink xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14" xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main">
  <oleLink xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" r:id="rId1" progId="Word.Document.12">
    <oleItems>
      <oleItem name="!OLE_LINK1" advise="1" preferPic="1"/>
    </oleItems>
  </oleLink>
</externalLink>

Apparently there is no "externalBook" element (the link is an "oleLink" instead), so book is set to None, and the code does not check for this condition before calling book.find.
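
For illustration, here is a minimal sketch of the kind of guard that would avoid the AttributeError. It is not the actual openpyxl code or the fix that was applied: the function name parse_ranges_guarded, the use of the standard library's xml.etree.ElementTree, and the choice to return an empty list for non-workbook links are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SHEET_MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def parse_ranges_guarded(xml):
    """Return (name, refersTo) pairs from an externalLink part.

    Hypothetical variant of parse_ranges: returns an empty list when the
    part has no externalBook element (for example, an oleLink).
    """
    tree = ET.fromstring(xml)
    book = tree.find('{%s}externalBook' % SHEET_MAIN_NS)
    if book is None:
        # Not a link to another workbook, so there are no ranges to read.
        return []
    names = book.find('{%s}definedNames' % SHEET_MAIN_NS)
    if names is None:
        # An external workbook without defined names is also possible.
        return []
    return [
        (node.get("name"), node.get("refersTo"))
        for node in names.findall('{%s}definedName' % SHEET_MAIN_NS)
    ]
```

With the "xml" value shown above, this sketch would return an empty list instead of raising.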

Would you say it's a bug, or just a faulty workbook? The file opens in Excel, albeit with a warning about invalid external links. If I remove the links from the file, openpyxl processes it fine.
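
To see which external link parts in a file are affected, something like the following diagnostic sketch might help. It uses only the standard library; the function name find_non_book_links is made up for this example, and it assumes the parts live under the conventional xl/externalLinks/ folder, which a given file is not strictly required to follow.

```python
import zipfile
import xml.etree.ElementTree as ET

SHEET_MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def find_non_book_links(source):
    """List external link parts that have no externalBook child.

    `source` is a path or a binary file-like object for an .xlsx file.
    Returns (part_name, [child element names]) tuples.
    """
    flagged = []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            # Assumption: link parts sit in the conventional folder.
            if not name.startswith("xl/externalLinks/"):
                continue
            if not name.endswith(".xml"):
                continue  # skips the accompanying .rels files
            root = ET.fromstring(archive.read(name))
            if root.find('{%s}externalBook' % SHEET_MAIN_NS) is None:
                # Strip the namespace to report readable element names.
                children = [child.tag.split('}')[-1] for child in root]
                flagged.append((name, children))
    return flagged
```

Calling find_non_book_links on the workbook path returns the parts that would trip the parse_ranges code quoted above, along with the element they contain instead (here, oleLink).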

Comments (4)

  1. Kirill Marchuk reporter

    Here it is. It certainly contains a broken link, but apart from showing a warning, it works fine in spreadsheet software (Excel 2013, OpenOffice). I believe openpyxl should handle this case as well.

  2. CharlieC

    Thanks for the file. It looks like the file is supposed to contain an embedded MS Word file. We'll have to investigate this a bit further.
