Exception when preserving whitespace in strings

Issue #410 resolved
CharlieC
created an issue

lxml raises an exception when the xml:space attribute is set on a node using the prefixed name. The fully qualified namespace must be used instead.
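As a rough illustration of what "fully qualified" means here, the sketch below sets the attribute using the namespace URI in braces rather than the `xml:` prefix. It uses the standard library's `xml.etree.ElementTree` instead of lxml so it runs without extra dependencies; the constant names `XML_NS` and `SPACE_ATTR` are chosen for this example (only `XML_NS` appears in the snippet quoted later in this thread).

```python
import xml.etree.ElementTree as ET

# The reserved XML namespace that the "xml:" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Fully qualified attribute name: namespace URI in braces, then the local name.
SPACE_ATTR = "{%s}space" % XML_NS

t = ET.Element("t")
t.set(SPACE_ATTR, "preserve")
t.text = "         test"

# The serializer maps the namespace URI back to the "xml:" prefix.
serialized = ET.tostring(t, encoding="unicode")
print(serialized)
```

The serialized element carries `xml:space="preserve"`, and reading the attribute back requires the same qualified name.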

Comments (8)

  1. Jake Katz

    There is still a bug here when reloading a worksheet:

    import openpyxl as xl
    expected_string = '         test'
    # Write a field with spacing
    wb = xl.Workbook()
    ws = wb.active
    ws['A1'] = expected_string
    wb.save('test.xlsx')
    
    # Reload the spreadsheet
    wb2 = xl.load_workbook('test.xlsx')
    ws = wb2.active
    
    # These will differ: the reloaded value has lost its leading spaces
    print(expected_string)
    print(ws['A1'].value)
    

    Edit: When reloading from a spreadsheet created by openpyxl, in openpyxl.readers.strings, the preserve attribute seems to be getting stripped out. In other words:

        if text_node.get('{%s}space' % XML_NS) != 'preserve':  # <--- No attributes, so text is always stripped
            text = text.strip()
    

    Edit2:

    Comparing to a good file produced by Excel, the issue is that the xml:space="preserve" attribute needs to be applied to the <t> element, not the <si> element.

  2. Jake Katz

    Happy to add a new issue, but I think the root cause is that the writer puts the attribute on the wrong element, while the reader checks the correct one. If I change the writer to apply the attribute to the text subelement instead of the <si> element, round-tripping works as expected. The output also matches a document created directly in Excel.
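The mismatch described in these comments can be reproduced in isolation. The following is a minimal sketch, not openpyxl's actual code: `write_si` and `read_si` are hypothetical helpers, the spreadsheet namespace is omitted for brevity, and only the reader's check is modeled on the snippet quoted above.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
SPACE_ATTR = "{%s}space" % XML_NS


def write_si(text, attr_on_t):
    """Hypothetical writer: build <si><t>text</t></si> with xml:space="preserve"
    placed either on <t> (attr_on_t=True) or on <si> (attr_on_t=False)."""
    si = ET.Element("si")
    t = ET.SubElement(si, "t")
    t.text = text
    target = t if attr_on_t else si
    target.set(SPACE_ATTR, "preserve")
    return ET.tostring(si, encoding="unicode")


def read_si(xml):
    """Hypothetical reader: like the check quoted above, it looks for the
    attribute on the <t> node and strips the text when it is absent."""
    text_node = ET.fromstring(xml).find("t")
    text = text_node.text or ""
    if text_node.get(SPACE_ATTR) != "preserve":
        text = text.strip()
    return text


expected = "         test"
lost = read_si(write_si(expected, attr_on_t=False))  # attribute on <si>
kept = read_si(write_si(expected, attr_on_t=True))   # attribute on <t>
print(repr(lost))
print(repr(kept))
```

With the attribute on `<si>`, the reader never sees it and the leading spaces are stripped; with the attribute on `<t>`, the string survives the round trip.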
