Move gc_stack_bottom before the increment of stacks_counter. This is needed because debug builds check the value of stacks_counter and expect the old (pre-increment) value.

         from pypy.module.cpyext.pyobject import Reference
         # we hope that malloc removal removes the newtuple() that is
         # inserted exactly here by the varargs specializer
+        llop.gc_stack_bottom(lltype.Void)   # marker for
         rffi.stackcounter.stacks_counter += 1
-        llop.gc_stack_bottom(lltype.Void)   # marker for
         retval = fatal_value
         boxed_args = ()