Reading list of merged cells impossible in read-only mode

Issue #540 resolved
Hamish Robertson created an issue

The worksheet.merged_cells and worksheet.merged_cell_ranges properties are both empty if the workbook has been opened in read-only mode.

As this is data regarding the status of the worksheet, and in no way coupled to writing to the worksheet, I can see no reason why this would not be included.

Comments (5)

  1. CharlieC

    Unfortunately, doing this would impose a potentially huge performance penalty on read-only mode, rendering any optimisation somewhat moot. I'd be happy to review a PR that implements the functionality, but I am not convinced it is required. Maybe something along the lines of forcing worksheet size calculation would be possible.

    FWIW the specification of merged cells is a total mess.

  2. Hamish Robertson reporter

    Please forgive my ignorance of the complexities, but why would there be such a performance hit from simply reading the mergeCell elements from the XML?

  3. CharlieC

    Because the XML is read as a stream, not by position, and the merged cell information comes after all the cells. So if you want it, you must read through all the cells first. This is something of an anti-pattern, even though memory use will remain low: if you want to do anything with cells that depends on whether they are merged, you will subsequently have to read them in again.

    Read-only really is an optimisation for very large (> 30 MB) Excel files where only a subset of the data is briefly required, e.g. something like "get rows 200 to 4000 from the third sheet".

    The alternative here is explicitly to disallow the merged cell methods and properties so that client code presents fewer surprises.
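
The streaming constraint described in comment 3 can be illustrated with a minimal sketch using only the Python standard library. This is not openpyxl's implementation: the tiny worksheet XML and the `merged_ranges` helper are hypothetical, made up for illustration. The sketch only shows that a forward-only parser passes every cell element before it reaches the first mergeCell element.

```python
import io
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Hypothetical, heavily simplified worksheet part: mergeCells follows sheetData.
SHEET_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<worksheet xmlns="{NS}">
  <sheetData>
    <row r="1"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c></row>
    <row r="2"><c r="A2"><v>3</v></c><c r="B2"><v>4</v></c></row>
  </sheetData>
  <mergeCells count="1">
    <mergeCell ref="A1:B1"/>
  </mergeCells>
</worksheet>
""".encode("utf-8")


def merged_ranges(stream):
    """Stream the XML and return (ref, cells_seen_so_far) for each mergeCell.

    Illustrative helper, not part of any library.
    """
    cells_seen = 0
    ranges = []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == f"{{{NS}}}c":
            cells_seen += 1
            elem.clear()  # drop the cell's content to keep memory use low
        elif elem.tag == f"{{{NS}}}mergeCell":
            ranges.append((elem.get("ref"), cells_seen))
    return ranges


print(merged_ranges(io.BytesIO(SHEET_XML)))
```

In this toy sheet all four cells are parsed before the merged range is seen. On a very large sheet the same ordering means a full pass over the cell data just to learn the merged ranges, followed by a second pass to use them.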
