ValueError: could not convert string to float: '$K$6'

Issue #699 resolved
Barnaby Keene
created an issue

Apologies for the arbitrary title, I can't seem to figure out exactly what's going on here.

I'm using the latest stable (pip) Python 3.5 version of openpyxl (2.4.0 according to pip show openpyxl) on Debian 8, when trying to extract this file: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/451064/20150803_People_Unformatted_Tables_FY_2014_15_Revised2_O.xlsx openpyxl throws this exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 57, in _convert
    value = expected_type(value)
ValueError: could not convert string to float: '$K$6'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "extract_test.py", line 37, in <module>
    text = loop.run_until_complete(coro)
  File "/usr/local/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/local/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/home/southclaw/work/hcc_extract_worker_py/textract/api.py", line 65, in process
    result = await parser(filename, **kwargs)
  File "/home/southclaw/work/hcc_extract_worker_py/textract/utils/parser.py", line 49, in __call__
    data = await self.extract(filename, encoding=encoding, **kw)
  File "/home/southclaw/work/hcc_extract_worker_py/textract/parsers/xlsx.py", line 14, in extract
    wb = load_workbook(filename, data_only=True)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 225, in load_workbook
    ws_parser.parse()
  File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/worksheet.py", line 122, in parse
    dispatcher[tag_name](element)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/worksheet.py", line 281, in parser_conditional_formatting
    cf = ConditionalFormatting.from_tree(element)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 76, in from_tree
    obj = desc.expected_type.from_tree(el)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 76, in from_tree
    obj = desc.expected_type.from_tree(el)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 76, in from_tree
    obj = desc.expected_type.from_tree(el)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 89, in from_tree
    return cls(**attrib)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/formatting/rule.py", line 53, in __init__
    self.val = val
  File "/usr/local/lib/python3.5/site-packages/openpyxl/formatting/rule.py", line 32, in __set__
    super(ValueDescriptor, self).__set__(instance, value)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 69, in __set__
    value = _convert(self.expected_type, value)
  File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 59, in _convert
    raise TypeError('expected ' + str(expected_type))
TypeError: expected <class 'float'>

I can only assume it's some form of type inference problem where the expected type may be derived from the cell type but it's actually a string and excel doesn't enforce this properly? I'm no real Excel spreadsheet OR openpyxl expert so I'm probably not much use. Hopefully this report is somewhat helpful though!

Comments (5)

  1. Charlie Clark

    hm, looks like Table 2 is the problem. Conditional formatting is more than a little tricky. The definition of the relative format says <cfvo type="num" val="$K$6"/> which is what we use as the basis for determining the type which we need if you ever want to inspect this in Python.

    You can workaround this fairly easily by setting self.expected_type = basestring in the ValueDescriptor setter. But before we make a general change, we'll need to think of the consequences because this would make inspecting existing files less useful.

  2. Max Edwards

    This has also been an issue for me, not sure if it's the conditional formatting or some other reason but I'm getting the same exception.

    I think that the package should gracefully handle these types of errors or at least give you an option of how to resolve it. Also telling you which sheet / cell the problem happened in would help. Right now it throws an exception and will not load the file and I don't know where the problem is.

    I'm not sure where I would set the expected_type to be basestring. The only line of code I have is: openpyxl.load_workbook('input.xlsx', data_only=True)

  3. Log in to comment