Cannot apply AutoFilter to columns implicitly; only explicit cell ranges are accepted

Issue #772 resolved
Kent Coble
created an issue

Hi there! I think we've found a bug with filters and sorts. In release version 2.3.5, we were able to apply a filter and sort to a column (range) with the following:

ws.auto_filter.ref = 'A:B'
ws.auto_filter.add_filter_column(1, ['Inactive'])
ws.auto_filter.add_sort_condition('A')

We upgraded to 2.4.2 for good measure after hearing about the potential XXE vulnerability. Since then, we cannot apply filters to a whole column anymore (we've tested only 2.4.2 and 2.4.4). When we try to run the first or third lines from the code above, the following error is raised:

  File "stalePortReport.py", line 488, in write
    ws.auto_filter.ref = 'A:B'
  File "/usr/local/lib/python2.7/dist-packages/openpyxl/descriptors/excel.py", line 94, in __set__
    super(CellRange, self).__set__(instance, value)
  File "/usr/local/lib/python2.7/dist-packages/openpyxl/descriptors/base.py", line 255, in __set__
    raise ValueError('Value does not match pattern {0}'.format(self.pattern))
ValueError: Value does not match pattern ^[$]?(?P<min_col>[A-Za-z]{1,3})[$]?(?P<min_row>\d+)(:[$]?(?P<max_col>[A-Za-z]{1,3})[$]?(?P<max_row>\d+)?)?$

To get around this, we must explicitly specify the cells, such as the following:

ws.auto_filter.ref = 'A1:B145'
ws.auto_filter.add_filter_column(1, ['Inactive'])
ws.auto_filter.add_sort_condition('A2:A145')

I'm not sure if our way of using it was actually a bug and got patched or if this was accidentally removed (change to Regex pattern?). All I could find was a note in the 2.4.0-a1 release about Convert AutoFilter to Serialisable and extend support for filters.

I've attached a workbook, should you not be able to replicate this in your own files. It's the second spreadsheet ('Building1') that we're trying to apply the filter to.

Comments (8)

  1. CharlieC

    Thanks but the supplied file doesn't have any filters for that worksheet?

    The changes in 2.4 were mainly to bring the code in line with the rest of the project and increase functionality and reliability. Test coverage for filters went from 65% to 100%.

    I suspect you were relying on largely undocumented and untested code that was reversed engineered from an existing file. This was fine in many ways but did not guarantee roundtripping for existing files.

    The specification § 18.3.2.7 has this to say about autofilters:

    The filterColumn collection identifies a particular column in the AutoFilter range and specifies filter information that has been applied to this column. If a column in the AutoFilter range has no criteria specified, then there is no corresponding filterColumn collection expressed for that column.
    

    So I think you always need to add the filter columns but the regex for the ref could be too strict: unhelpfully the spec just says it can be a string.

    Client code, however, should really change to use the new API so you should manually create a filterColumn, which has many more options than the existing method and append it to the existing ones.

  2. Kent Coble reporter

    Thanks but the supplied file doesn't have any filters for that worksheet?

    I was supplying the workbook in the state it was in when we ran into the Exception. As such, the worksheet does not have filters applied to it. I attached it in case someone wanted to verify that the code gave them the same Exception we were running into (and so they wouldn't have to start from scratch).

    I suspect you were relying on largely undocumented and untested code that was reversed engineered from an existing file. This was fine in many ways but did not guarantee roundtripping for existing files.

    I'm not sure what you're referring to. Our code is almost the same as the docs.

    Sorry, I just realized that I didn't copy the traceback when the exception is raise. Here:

    File "stalePortReport.py", line 488, in write
        ws.auto_filter.ref = 'A:B'
      File "/usr/local/lib/python2.7/dist-packages/openpyxl/descriptors/excel.py", line 94, in __set__
        super(CellRange, self).__set__(instance, value)
      File "/usr/local/lib/python2.7/dist-packages/openpyxl/descriptors/base.py", line 255, in __set__
        raise ValueError('Value does not match pattern {0}'.format(self.pattern))
    ValueError: Value does not match pattern ^[$]?(?P<min_col>[A-Za-z]{1,3})[$]?(?P<min_row>\d+)(:[$]?(?P<max_col>[A-Za-z]{1,3})[$]?(?P<max_row>\d+)?)?$
    
  3. CharlieC

    I am referring to the more complete API now provided. For example, you can now use colour filters and have icon sets (I've no idea whether you'd want this or not). The documentation needs updating but it's advance notice that the add_* methods will probably get deprecated at some point.

    Prior to 2.4 the range for a filter was never syntactically validated so errors were possible which might not have become visible until the file was opened.

  4. CharlieC

    Thanks for the file. It's worth noting that Excel itself always calculates a range in the form ref="A1:B1048576" in this case but I think we can create a regex and a suitable descriptor so that simple column references will be possible.

  5. Log in to comment