(I know very little about OO-XML so this is mostly guess work, sorry.)
Some of our workbooks raise
load_workbook()ing under 1.9. This appears to be because they contain string values that start with a space and the
ILLEGAL_CHARACTERS_RE pattern is both incorrect and that it is used with
match instead of
In 1.9 we have
ILLEGAL_CHARACTERS_RE = re.compile('|'.join(chr(x) for x in range(33)))
i.e. the pattern is
I suspect this pattern is wrong because 0x20 is space. It seems unlikely that strings should never contain spaces. I suggest
[\000-\037] as a replacement.
Secondly this problem only appears on strings that start with things that match this pattern because the pattern is checked like this
and the Python docs say this about
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance.
It seems likely the behaviour that was intended is that of
Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance.
next(re.finditer, None) since only care if there are greater than zero matches?