Parse function identifies RANGE incorrectly
My program is supposed to take each cell with a formula in the worksheet and "explode" it. This means it parses the excel formula and replaces each cell reference (each RANGE, ie A22) with the value of that cell surrounded by parenthesis. So if excel cell A22=A3A5, and cell B10 value is =A22+10, B10 becomes = (A3A5)+10. The program then keeps looping and replacing the cell values until they're just raw numbers and functions.
This works well until until the program gets to a particular cell in the excel file I'm working on, C34.
C34=IF(B1_Data,IF(B1_Suf,IF(B1_05,BD34,""),IF(AND(B2_Data,B1B2_Suf),IF(B1_05,BD34,""),BD30)),BD29)
The first and second iteration goes well...
'C34_lvl0': 'IF($BA$29,IF($BA$31,IF($BA$33,BD34,""),IF(AND($BA$30,$BA$32),IF($BA$33,BD34,""),BD30)),BD29)'
'C34_lvl1':
'=IF((NOT(OR(I29="",I30="",I31="",I31=10.0001,I32=""))),IF((AH32<0),IF((AND(#REF!>0,#REF!<0.5)),(LDddTaq MASS TOO SMALL  INCREASE BUILD SIZE),""),IF(AND((NOT(OR(I36="",I37="",I38="",I38=10.0001,I39=""))),(AH39<0)),IF((AND(#REF!>0,#REF!<0.5)),(LDddTaq MASS TOO SMALL  INCREASE BUILD SIZE),""),(INSUFFICIENT LDddTaq TO SUPPORT BUILD, USE 2ND BATCH))),(ENTER ACTUAL LDddTaq POLYMERASE VALUES FOR LOT #1))
But then I get the below error with the next iteration. When I parse the above (lvl1) formula, I see it is showing (NOT(OR(I29="",I30="",I31="",I31=10.0001,I32=""))) as a single range. This does not happen in other excel IF functions.
Anyone know what is going on or a work around? Is this a bug? I'm using openpyxl version 2.4.9 and python 3.6. I attached my code if it helps.
The Error:
Traceback (most recent call last): File "C:\Python\20180818HW_ValidationAid_v016.py", line 186, in tok=Tokenizer(tFinal) #dissect each cell, do the spiel File "C:\Python\lib\sitepackages\openpyxl\formula\tokenizer.py", line 53, in init self._parse() File "C:\Python\lib\sitepackages\openpyxl\formula\tokenizer.py", line 86, in _parse self.offset += dispatchercurr_char File "C:\Python\lib\sitepackages\openpyxl\formula\tokenizer.py", line 157, in _parse_error (self.offset, self.formula)) openpyxl.formula.tokenizer.TokenizerError: Invalid error code at position 402 in '=IF((NOT(OR(I29="",I30="",I31="",I31=10.0001,I32=""))),IF((AH32<0),IF((AND(#REF!>0,#REF!<0.5)),(LDddTaq MASS TOO SMALL  INCREASE BUILD SIZE),""),IF(AND((NOT(OR(I36="",I37="",I38="",I38=10.0001,I39=""))),(AH39<0)),IF((AND(#REF!>0,#REF!<0.5)),(LDddTaq MASS TOO SMALL  INCREASE BUILD SIZE),""),(INSUFFICIENT LDddTaq TO SUPPORT BUILD, USE 2ND BATCH))),(ENTER ACTUAL LDddTaq POLYMERASE VALUES FOR LOT #1))'
Comments (6)


Looking again at this and I don't really think it's a bug. The tokeniser can happily decompose and then recompose the formula.
Are you trying to use openpyxl to evaluate formulae? If so, the library isn't suitable for this. You're better of using something like Pycel which purports to do this. I suspect that there are gotcha related to the order of evaluation in this kind of thing.

reporter I was trying to use it evaluating formulas, yes. Specifically it would parse one formula, then replace all precedent cells mentioned with each precedent cell's value. It would do this until all final output cells were one big equation of static values. This would be successful for a few rounds, then hit this little snag I described above. If you don't think it's a bug, I'll believe you.
Thank you for looking into it again! I'll check out Pycel, haven't heard of it before.

You probably have some kind of circular reference in there, the applications have ways of dealing with that. If the formula isn't already evaluated in there then just open the file with Excel or OpenOffice and read it in dataonly mode with Excel.
FWIW I think you're approach is flawed. Much better to transliterate the formulae to Python equivalents using the tokeniser.

 changed status to resolved
Trying to use the library for something it's not really suited.

 removed version
Removing version: 2.4.x (automated comment)
 Log in to comment
Thanks for the report. The tokeniser isn't my work so I'm unsure when this can be fixed.