Tidy Office Open XML files before committing

Issue #1 closed
devuxer created an issue

I recently tried to use this extension with an Excel 2010 (.xlsx) file. It basically didn't work, because each XML file inside the zip archive has all its data on a single line (no line breaks), which makes it impossible for Mercurial to do a line-by-line analysis.

I'm wondering if ZipDoc can be enhanced to do some sort of "XML tidy" operation (which would format the XML data onto multiple lines with proper indenting) on the XML files inside Office Open XML documents prior to committing into the repository. This would allow line-based deltas to be calculated meaningfully.

Also, when updating, the file would need to be re-compressed (untidied) to enable it to open properly in Excel.

Comments (4)

  1. Antoine JOUANJEAN

    I was looking for the exact same feature.

    I am no Python programmer, but I found a starting point (for decoding) here: http://stackoverflow.com/questions/749796/pretty-printing-xml-in-python

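    import xml.dom.minidom
    # xml_fname is the path to one XML file; the result is named doc so it does not shadow the xml module
    doc = xml.dom.minidom.parse(xml_fname)  # or xml.dom.minidom.parseString(xml_string)
    pretty_xml_as_string = doc.toprettyxml()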

    Unfortunately, I haven't found a solution for encoding, i.e. for undoing the pretty-printing to get back to a one-line document...

    Anyway, could be a really great feature!

    Thanks in advance for your help!

  2. devuxer reporter

    @ant1j, I'm not a Python programmer either, but what you found looks pretty promising. I would have to assume there is something out there that could reverse the process.

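
On the open question of reversing the pretty-printing: one possible approach, sketched here with the same `xml.dom.minidom` module, is to remove the whitespace-only text nodes that `toprettyxml()` inserts and then serialise with `toxml()`, which adds no line breaks. The function names `tidy`, `untidy` and `strip_whitespace_nodes` are hypothetical. Skipping elements marked `xml:space="preserve"` is an assumption about what a spreadsheet needs, and the round trip is not guaranteed to be byte-identical (the XML declaration, for example, is rewritten).

```python
import xml.dom.minidom


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Assumption: whitespace under xml:space="preserve" is significant.
    if (node.nodeType == node.ELEMENT_NODE
            and node.getAttribute("xml:space") == "preserve"):
        return
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


def tidy(xml_string):
    """Return the document indented across multiple lines."""
    return xml.dom.minidom.parseString(xml_string).toprettyxml(indent="  ")


def untidy(xml_string):
    """Return the document collapsed back onto a single line."""
    doc = xml.dom.minidom.parseString(xml_string)
    strip_whitespace_nodes(doc)
    return doc.toxml()
```

Elements that mix text and child elements may still pick up extra whitespace from `toprettyxml()`, so this would need checking against real workbooks before relying on it.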