gcc 4.7.2 miscompiles stackless
(originally reported in Trac by @Anselm Kruis on 2013-04-07 09:39:02)
Compilers constantly improve. When I tried to build Stackless 2.7.4rc1 with gcc 4.7.2, the Stackless unit tests didn't terminate: Python went into an endless loop just before exiting.
It turned out that gcc over-optimised climb_stack_and_transfer. The compiler removed the alloca call (its result is never used) and then performed a tail-recursion optimisation. Pretty cool. :-)
Details: Linux amd64, gcc 4.7.2, option "-O2". The relevant command-line switches are:
-foptimize-sibling-calls
-ftree-vrp
-ftree-dce
Because this optimisation does not depend on the architecture, I suspect that this problem affects other architectures and Stackless versions too.
I see two possibilities to fix this issue:
1. Add some #pragmas or specific compiler switches to inhibit the optimisation.
2. Use the pointer returned from alloca, i.e. store the pointer in a global variable.
I prefer option 2 because it is less compiler-specific, and the overhead of an additional write is negligible.
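To illustrate option 2, here is a minimal sketch. It is not the actual Stackless code. The names alloca_sink, distance_from and reserve_and_measure are made up for this example, and it assumes a platform that provides <alloca.h> and a compiler that understands __attribute__((noinline)), such as gcc or clang. The point is only that the compiler has to perform a write to a volatile global, so it can no longer treat the alloca result as dead and drop the allocation.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global sink. Because it is volatile, the store below is an
 * observable side effect, so the alloca that produced the value stays. */
void *volatile alloca_sink;

/* Returns how far the callee's frame is from the address 'base'.
 * noinline (a gcc/clang extension) keeps this a real call with its own frame. */
static __attribute__((noinline)) size_t distance_from(uintptr_t base)
{
    volatile char probe = 0;
    uintptr_t here = (uintptr_t)&probe;
    return here > base ? here - base : base - here;
}

/* Reserves 'bytes' of stack with alloca, publishes the pointer, and then
 * measures how far the next frame has moved away from this one. */
size_t reserve_and_measure(size_t bytes)
{
    volatile char anchor = 0;
    void *pad = alloca(bytes);
    size_t depth;

    alloca_sink = pad;              /* the additional write from option 2 */
    depth = distance_from((uintptr_t)&anchor);
    anchor = 1;                     /* side effect after the call: not a tail call */
    return depth;
}
```

With the store in place, calling reserve_and_measure(4096) should report a distance of at least 4096 bytes on common ABIs, because the padding sits between the caller's locals and the callee's frame. Without the store, an optimising compiler is free to drop the padding.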
Comments (3)
-
@Anselm Kruis on 2013-04-24 16:30:58 said:
I fixed it for 2.7-slp. Changeset [53f0e5446729]
To do: eventually port the fix to 3.x-slp
-
Anselm Kruis
- changed status to resolved
Resolved for 3.x since commit fac9fb5b115e.
-
@Christian Tismer on 2013-04-07 10:28:32 said:
Option 2 looks fine to me. We can make the global invisible enough so that we never use it. I prefer solutions that (presumably) cannot be optimized away in the future ;-)