Hidden columns after copying column formatting

Issue #798 wontfix
Walter Gurr
created an issue

I tracked down some unexpected behavior that can 'hide columns' in 2.4.5; I think it would be worth fixing.

When Excel has consecutive columns with the same formatting it consolidates them to one customWidth xml item, such as this for columns M and N (columns 13 and 14):

<col customWidth="1" style="4" width="8.42578125" max="14" min="13"/> 

If I add a cell in column N in openpyxl, then it will have default formatting, rather than the formatting of the column, so I copy formatting from it's host column like so:

    def copy_column_format(self, cell):
        # Copy column formatting
        cell.number_format = cell.parent.column_dimensions[cell.column].number_format
        cell.font = cell.parent.column_dimensions[cell.column].font.copy()
        cell.alignment = cell.parent.column_dimensions[cell.column].alignment.copy()

Little did I realize when I wrote the code that there is no column_dimension['N'], and that openpyxl would then automatically and quietly create the missing column_dimension with default settings. Thus the new column_dimension has default formatting and a width of 0.

When the sheet is subsequently saved and opened in Excel the column is now hidden, and the data I wrote and column have the wrong formatting.

My workaround will be to check if the column_dimension exists, and if not found keep iterating back to find the relevant preceding column_dimension.

One proposal for openpyxl is that when new cells are created there is an option to apply formatting for that column (the same way Excel does). Perhaps my workaround could also be implemented to avoid the confusing behavior.

Note I also saw this issue on stackoverflow, but I'm not sure if it's the same thing: http://stackoverflow.com/questions/29661324/column-set-to-hidden-when-editing-existing-xlsx-file

Comments (9)

  1. Walter Gurr reporter

    Well, if there's a real performance concern then I could think of a couple options: - Throw an exception, or at least a warning if someone tries to read a column_dimension that doesn't exist. - Change the default width for a new column_dimension to the Excel default (64 pixels/8.43).

    Did I at least get across the problem and how this would be unexpected behavior?

  2. CharlieC

    The performance relates to lookups for every cell and style copying. Use NamedStyles if you want to apply the same style to lots of cells.

    Raising an exception if there is no ColumnDimension makes client code unnecessarily clumsy. The current implementation conforms with the specification. Columns are not hidden by default and should respect the defaultColumnWidth but what this looks like is up to the consuming application.

  3. CharlieC

    This is one of the areas where application like functionality is difficult to do. It is, for example, not documented how styles should be applied and the potential performance implications of always checking whether row and column styles exist for every cell would be considerable. The alternative is for client code to apply named styles to the relevant columns or rows, which is trivial.

  4. Walter Gurr reporter

    Fair enough. Hopefully this thread will clue others as to what's going on if columns mysteriously start hiding on their sheets.

    For my application named styles are less preferable than copying formatting from the sheet as I don't want to define formatting in Python - makes things more maintainable where there's a great deal of formatting to deal with and people want to fiddle with it.

    Here's my brute force code in case anyone else wants to implement something similar in their application. If openpyxl has a utility library then perhaps you could consider adding something similar at some point:

    def copy_column_format(self, cell):
        # Search for applicable column_dimension
        format_column = cell.column
        available_format_columns = cell.parent.column_dimensions.keys()
        while format_column not in available_format_columns:
            if format_column == 'A':
                return
            format_column = xl_ops.col_to_alpha(xl_ops.alpha_to_col(format_column) - 1)
        # Copy column formatting to cell
        cell.number_format = cell.parent.column_dimensions[format_column].number_format
        cell.font = cell.parent.column_dimensions[format_column].font.copy()
        cell.alignment = cell.parent.column_dimensions[format_column].alignment.copy()
    
    def col_to_alpha(col):
        alpha = str(string.ascii_uppercase)
        lower_letter = alpha[col % len(alpha)]
        if (col / len(alpha)) > 0:
            return '%s%s' % (col_to_alpha((col / len(alpha)) - 1), lower_letter)
        else:
            return lower_letter
    
    def alpha_to_col(alpha):
        if len(alpha) == 1:
            return ord(alpha)-65
        elif len(alpha) == 2:
            return ord(alpha[1])-65+(ord(alpha[0])-64)*26
    
  5. CharlieC

    Why do people reimplement their own code for converting between Excel style and numerical coordinates? Sigh.

    It would make a lot more sense to create named styles from the column definitions and apply them. Cleaner and faster.

  6. Log in to comment