PEP: 419
Title: Protecting cleanup statements from interruptions
Version: $Revision$
Last-Modified: $Date$
Author: Paul Colomiets <paul@colomiets.name>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Apr-2012
Python-Version: 3.3


Abstract
========

This PEP proposes a way to protect Python code from being interrupted
inside a finally clause or during context manager cleanup.


Rationale
=========

Python has two nice ways to do cleanup.  One is a ``finally``
statement and the other is a context manager (usually called using a
``with`` statement).  However, neither is protected from interruption
by ``KeyboardInterrupt`` or ``GeneratorExit`` caused by
``generator.throw()``.  For example::

    lock.acquire()
    try:
        print('starting')
        do_something()
    finally:
        print('finished')
        lock.release()

If ``KeyboardInterrupt`` occurs just after the second ``print()``
call, the lock will not be released.  Similarly, the following code
using the ``with`` statement is affected::

    from threading import Lock

    class MyLock:

        def __init__(self):
            self._lock_impl = Lock()

        def __enter__(self):
            self._lock_impl.acquire()
            print("LOCKED")

        def __exit__(self, exc_type, exc_value, traceback):
            print("UNLOCKING")
            self._lock_impl.release()

    lock = MyLock()
    with lock:
        do_something()

If ``KeyboardInterrupt`` occurs near any of the ``print()`` calls, the
lock will never be released.
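
The failure mode can be reproduced deterministically.  The following
is a minimal sketch rather than part of the proposal:
``interrupted_print()`` is a hypothetical stand-in that simulates
``KeyboardInterrupt`` arriving during the ``print()`` call inside the
``finally`` clause.

```python
import threading

lock = threading.Lock()


def interrupted_print(message):
    # Hypothetical stand-in for print(): simulates Ctrl-C arriving
    # while the cleanup code is running.
    raise KeyboardInterrupt()


def critical_section():
    lock.acquire()
    try:
        pass  # the protected work would go here
    finally:
        interrupted_print('finished')
        lock.release()  # never reached in this simulation


try:
    critical_section()
except KeyboardInterrupt:
    pass

# The interrupt escaped the finally clause before release() ran.
still_locked = lock.locked()
```

After the simulated interrupt, ``still_locked`` is true, which is the
leaked lock described above.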


Coroutine Use Case
------------------

A similar case occurs with coroutines.  Usually coroutine libraries
want to interrupt the coroutine with a timeout.  The
``generator.throw()`` method works for this use case, but there is no
way of knowing if the coroutine is currently suspended from inside a
``finally`` clause.

An example that uses yield-based coroutines follows.  The code looks
similar using any of the popular coroutine libraries Monocle [1]_,
Bluelet [2]_, or Twisted [3]_. ::

    def run_locked():
        yield connection.sendall('LOCK')
        try:
            yield do_something()
            yield do_something_else()
        finally:
            yield connection.sendall('UNLOCK')

    with timeout(5):
        yield run_locked()

In the example above, ``yield something`` means to pause executing the
current coroutine and to execute coroutine ``something`` until it
finishes execution.  Therefore the coroutine library itself needs to
maintain a stack of generators.  The ``connection.sendall()`` call waits
until the socket is writable and does a similar thing to what
``socket.sendall()`` does.

The ``with`` statement ensures that all code is executed within a
5-second timeout.  It does so by registering a callback in the main
loop, which calls ``generator.throw()`` on the top-most frame in the
coroutine stack when a timeout happens.

The ``greenlets`` extension works in a similar way, except that it
doesn't need ``yield`` to enter a new stack frame.  Otherwise, the
considerations are similar.
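
The problem can be shown with a plain generator and no coroutine
library at all.  In the sketch below, ``Timeout`` is a hypothetical
exception standing in for whatever a library would throw, and the
``log`` list replaces the network calls.  The generator is suspended
at a ``yield`` inside its ``finally`` clause when ``throw()`` is
called, so the rest of the cleanup is skipped.

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw."""


log = []


def run_locked():
    log.append('LOCK')
    try:
        yield 'working'
    finally:
        yield 'unlocking'      # suspended inside the finally clause
        log.append('UNLOCK')   # skipped if throw() arrives at the yield


gen = run_locked()
next(gen)   # runs up to the yield in the try body
next(gen)   # enters the finally clause and suspends at its yield

try:
    gen.throw(Timeout())   # the "timeout" fires in the middle of cleanup
except Timeout:
    pass
```

The caller of ``throw()`` has no way to tell that the second
suspension point was inside a ``finally`` clause, so ``log`` ends up
containing only ``'LOCK'``.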


Specification
=============

Frame Flag 'f_in_cleanup'
-------------------------

A new flag on the frame object is proposed.  It is set to ``True`` if
this frame is currently executing a ``finally`` clause.  Internally,
the flag must be implemented as a counter of nested finally statements
currently being executed.

The internal counter also needs to be incremented during execution of
the ``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and decremented
when execution of these bytecodes is finished.  This also protects
the ``__enter__()`` and ``__exit__()`` methods.


Function 'sys.setcleanuphook'
-----------------------------

A new function for the ``sys`` module is proposed.  This function sets
a callback which is executed every time ``f_in_cleanup`` becomes
false.  Callbacks get a frame object as their sole argument, so that
they can figure out where they are called from.

The setting is thread local and must be stored in the
``PyThreadState`` structure.
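
The interaction between the counter and the hook can be modelled in
pure Python.  This is only a sketch of the semantics described above:
``FakeFrame``, ``enter_cleanup()`` and ``leave_cleanup()`` are
hypothetical names, ``setcleanuphook()`` here is a local model rather
than a real ``sys`` function, and ``threading.local()`` stands in for
the per-thread ``PyThreadState`` slot.  The text does not say whether
the hook is cleared after it fires; this model leaves it installed.

```python
import threading

_thread_state = threading.local()   # models the per-thread storage


def setcleanuphook(hook):
    """Model of the proposed sys.setcleanuphook(); not a real function."""
    _thread_state.hook = hook


class FakeFrame:
    """Hypothetical stand-in for a frame object."""

    def __init__(self):
        self._cleanup_depth = 0     # the internal counter

    @property
    def f_in_cleanup(self):
        return self._cleanup_depth > 0

    def enter_cleanup(self):
        self._cleanup_depth += 1

    def leave_cleanup(self):
        self._cleanup_depth -= 1
        if self._cleanup_depth == 0:
            # The flag just became false: call the hook with the frame.
            hook = getattr(_thread_state, 'hook', None)
            if hook is not None:
                hook(self)


calls = []
setcleanuphook(calls.append)

frame = FakeFrame()
frame.enter_cleanup()             # outer finally clause
frame.enter_cleanup()             # nested finally clause
frame.leave_cleanup()             # still in cleanup: hook not called
calls_after_inner = list(calls)
frame.leave_cleanup()             # flag becomes false: hook is called
```

In this model the hook fires once, after the outermost cleanup ends,
and receives the frame as its sole argument.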


Inspect Module Enhancements
---------------------------

Two new functions are proposed for the ``inspect`` module:
``isframeincleanup()`` and ``getcleanupframe()``.

``isframeincleanup()``, given a frame or generator object as its sole
argument, returns the value of the ``f_in_cleanup`` attribute of a
frame itself or of the ``gi_frame`` attribute of a generator.

``getcleanupframe()``, given a frame object as its sole argument,
returns the innermost frame which has a true value of
``f_in_cleanup``, or ``None`` if no frames in the stack have a nonzero
value for that attribute.  It starts to inspect from the specified
frame and walks to outer frames using ``f_back`` pointers, just like
``getouterframes()`` does.
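
A pure-Python model of the two functions might look like the
following.  It is a sketch only: real frame objects have no
``f_in_cleanup`` attribute, so ``types.SimpleNamespace`` objects are
used as hypothetical fake frames and as a fake generator.

```python
from types import SimpleNamespace


def isframeincleanup(obj):
    # Accept either a frame-like object or a generator-like object.
    frame = getattr(obj, 'gi_frame', obj)
    return bool(frame.f_in_cleanup)


def getcleanupframe(frame):
    # Walk outwards via f_back; the first match is the innermost one.
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None


# Fake call stack: inner -> middle -> outer
outer = SimpleNamespace(f_in_cleanup=True, f_back=None)
middle = SimpleNamespace(f_in_cleanup=True, f_back=outer)
inner = SimpleNamespace(f_in_cleanup=False, f_back=middle)
fake_generator = SimpleNamespace(gi_frame=middle)

found = getcleanupframe(inner)
```

Starting from ``inner``, the walk returns ``middle`` rather than
``outer``, because ``middle`` is the innermost frame with the flag
set.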


Example
=======

An example implementation of a SIGINT handler that interrupts safely
might look like::

    import inspect, sys, functools

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(functools.partial(sigint_handler, 0))

A coroutine example is out of scope for this document, because its
implementation depends very much on the trampoline (or main loop)
used by the coroutine library.
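
Because the proposed functions do not exist in any released Python,
the handler cannot be run as written.  The following sketch simulates
its control flow with hypothetical stand-ins: ``FakeFrame``, the
module-level ``setcleanuphook()`` and ``getcleanupframe()`` are local
models, and the point where the interpreter would call the hook is
invoked by hand.

```python
import functools


class FakeFrame:
    """Hypothetical stand-in for a frame object."""

    def __init__(self):
        self.cleanup_depth = 0
        self.f_back = None

    @property
    def f_in_cleanup(self):
        return self.cleanup_depth > 0


_hook = None


def setcleanuphook(callback):
    # Local model of the proposed sys.setcleanuphook().
    global _hook
    _hook = callback


def getcleanupframe(frame):
    # Local model of the proposed inspect.getcleanupframe().
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None


def sigint_handler(sig, frame):
    if getcleanupframe(frame) is None:
        raise KeyboardInterrupt()
    # In cleanup: defer the interrupt until the cleanup has finished.
    setcleanuphook(functools.partial(sigint_handler, 0))


events = []
frame = FakeFrame()

frame.cleanup_depth += 1          # a finally clause starts
events.append('cleanup started')
sigint_handler(2, frame)          # signal arrives: nothing is raised
events.append('lock released')
frame.cleanup_depth -= 1          # the finally clause ends

try:
    _hook(frame)                  # the interpreter would call the hook
except KeyboardInterrupt:
    events.append('KeyboardInterrupt')
```

In this simulation the interrupt is delivered only after the cleanup
step has been recorded.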


Unresolved Issues
=================

Interruption Inside With Statement Expression
---------------------------------------------

Given the statement ::

    with open(filename):
        do_something()

Python can be interrupted after ``open()`` is called, but before the
``SETUP_WITH`` bytecode is executed.  There are two possible
decisions:

* Protect ``with`` expressions.  This would require another bytecode,
  since currently there is no way of recognizing the start of the
  ``with`` expression.

* Let the user write a wrapper if they consider it important for the
  use-case.  A safe wrapper might look like this::

      class FileWrapper(object):

          def __init__(self, filename, mode):
              self.filename = filename
              self.mode = mode

          def __enter__(self):
              self.file = open(self.filename, self.mode)
              return self.file

          def __exit__(self, exc_type, exc_value, traceback):
              self.file.close()

  Alternatively it can be written using the ``contextmanager()``
  decorator::

      from contextlib import contextmanager

      @contextmanager
      def open_wrapper(filename, mode):
          file = open(filename, mode)
          try:
              yield file
          finally:
              file.close()

  This code is safe, as the first part of the generator (before yield)
  is executed inside the ``SETUP_WITH`` bytecode of the caller.


Exception Propagation
---------------------

Sometimes a ``finally`` clause or an ``__enter__()``/``__exit__()``
method can raise an exception.  Usually this is not a problem, since
more important exceptions like ``KeyboardInterrupt`` or ``SystemExit``
should be raised instead.  But it may be nice to be able to keep the
original exception inside a ``__context__`` attribute.  So the cleanup
hook signature may grow an exception argument::

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(retry_sigint)

    def retry_sigint(frame, exception=None):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt() from exception

.. note::

   There is no need to have three arguments like in the ``__exit__``
   method, since exceptions carry a ``__traceback__`` attribute in
   Python 3.

However, this will set the ``__cause__`` for the exception, which is
not exactly what's intended.  So some hidden interpreter logic may be
used to put a ``__context__`` attribute on every exception raised in a
cleanup hook.
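
The difference between the two attributes can be checked with
ordinary Python 3 code.  In this sketch, ``retry_with_from()`` mirrors
the ``raise ... from exception`` form used in the hook above, while
``retry_while_handling()`` is a hypothetical contrast case in which
the exception is raised while another one is being handled.

```python
def retry_with_from(exception):
    # Explicit chaining: sets __cause__ on the new exception.
    raise KeyboardInterrupt() from exception


def retry_while_handling():
    try:
        raise ValueError('cleanup failed')
    except ValueError:
        # Implicit chaining: sets __context__ on the new exception.
        raise KeyboardInterrupt()


original = ValueError('cleanup failed')

try:
    retry_with_from(original)
except KeyboardInterrupt as exc:
    explicit = exc

try:
    retry_while_handling()
except KeyboardInterrupt as exc:
    implicit = exc
```

``explicit.__cause__`` is the original exception and its
``__context__`` is ``None``, whereas ``implicit`` has no ``__cause__``
and keeps the ``ValueError`` in ``__context__``.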


Interruption Between Acquiring Resource and Try Block
-----------------------------------------------------

The example from the first section is not totally safe.  Let's take a
closer look::

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

The problem might occur if the code is interrupted just after
``lock.acquire()`` is executed but before the ``try`` block is
entered.

This code cannot be made safe without modifying it.  The actual fix
depends very much on the use case.  Usually the code can be fixed
using a ``with`` statement::

    with lock:
        do_something()

However, coroutines usually can't use the ``with`` statement, because
both the acquire and the release operations need to ``yield``.  So
the code might be rewritten like this::

    try:
        yield lock.acquire()
        do_something()
    finally:
        yield lock.release()

The lock implementation might need extra code to support this use
case, but the change is usually trivial: on release, check whether
the lock has actually been acquired, and unlock only if it has.
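
A minimal sketch of such a lock follows.  ``CoopLock`` and its
``owner`` parameter are assumptions made for illustration, and the
methods are plain functions here, where a real coroutine library
would make them suspend the coroutine until the lock is free.

```python
class CoopLock:
    """Hypothetical cooperative lock for yield-based coroutines."""

    def __init__(self):
        self._owner = None

    def acquire(self, owner):
        # A real implementation would suspend until the lock is free.
        if self._owner is not None:
            raise RuntimeError('lock is already held')
        self._owner = owner

    def release(self, owner):
        # Tolerate release() when acquire() never completed for this
        # owner, for example because it was interrupted.
        if self._owner is owner:
            self._owner = None


lock = CoopLock()
task = object()

lock.release(task)    # acquire() never ran: a harmless no-op
lock.acquire(task)
lock.release(task)    # the normal path releases the lock
```

With this behaviour, a ``finally`` clause can call ``release()``
unconditionally, even if the interruption happened before the lock
was taken.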


Handling EINTR Inside a Finally
-------------------------------

Even if a signal handler is prepared to check the ``f_in_cleanup``
flag, ``InterruptedError`` might be raised in the cleanup handler,
because the respective system call returned an ``EINTR`` error.  The
primary use cases are prepared to handle this:

* POSIX mutexes never return ``EINTR``

* Networking libraries are always prepared to handle ``EINTR``

* Coroutine libraries are usually interrupted with the ``throw()``
  method, not with a signal

The platform-specific function ``siginterrupt()`` might be used to
remove the need to handle ``EINTR``.  However, its consequences can
be hard to predict; for example, a ``SIGINT`` handler is never called
if the main thread is stuck inside an IO routine.

A better approach would be for code that is commonly used in cleanup
handlers to handle ``InterruptedError`` explicitly.  An example of
such code might be a file-based lock implementation.

``signal.pthread_sigmask`` can be used to block signals inside
cleanup handlers which can be interrupted with ``EINTR``.
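
As a rough sketch of the last option, the helper below blocks
``SIGINT`` for the duration of a block and restores the previous mask
afterwards.  ``signals_blocked()`` and ``current_mask()`` are
hypothetical names; ``signal.pthread_sigmask()`` is available only on
platforms that provide it (Unix) and affects only the calling thread.

```python
import signal
from contextlib import contextmanager


@contextmanager
def signals_blocked(signums=(signal.SIGINT,)):
    """Hypothetical helper: block the given signals inside a block."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signums)
    try:
        yield
    finally:
        # Restore the previous mask; pending signals are delivered
        # once they are unblocked.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def current_mask():
    # Blocking an empty set leaves the mask unchanged and returns it.
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])


before = current_mask()
with signals_blocked():
    blocked_inside = signal.SIGINT in current_mask()
after = current_mask()
```

A cleanup handler could wrap its interruptible system calls in
``with signals_blocked():`` so that the signal is delivered only
after the cleanup has completed.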


Setting Interruption Context Inside Finally Itself
--------------------------------------------------

Some coroutine libraries may need to set a timeout for the finally
clause itself.  For example::

    try:
        do_something()
    finally:
        with timeout(0.5):
            try:
                yield do_slow_cleanup()
            finally:
                yield do_fast_cleanup()

With current semantics, the timeout will either protect the whole ``with``
block or nothing at all, depending on the implementation of each
library.  What the author intended is to treat ``do_slow_cleanup`` as
ordinary code, and ``do_fast_cleanup`` as a cleanup (a
non-interruptible one).

A similar case might occur when using greenlets or tasklets.

This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
by calling a cleanup hook on each decrement.  A coroutine library may
then remember the value at timeout start, and compare it on each hook
execution.
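
The comparison could be sketched as follows.  Everything here is an
assumption made for illustration: ``FakeFrame`` models a frame whose
counter is exposed as ``cleanup_depth``, and ``TimeoutScope`` is a
hypothetical helper that a coroutine library might create when a
timeout starts.

```python
class FakeFrame:
    """Hypothetical frame with f_in_cleanup exposed as a counter."""

    def __init__(self):
        self.cleanup_depth = 0


class TimeoutScope:
    """Hypothetical record of the counter value at timeout start."""

    def __init__(self, frame):
        self.frame = frame
        self.base_depth = frame.cleanup_depth

    def may_interrupt(self):
        # Only cleanup entered after the timeout started is protected.
        return self.frame.cleanup_depth <= self.base_depth


frame = FakeFrame()
frame.cleanup_depth += 1             # the outer finally clause starts
scope = TimeoutScope(frame)          # the timeout block is entered here

during_slow = scope.may_interrupt()  # while do_slow_cleanup() runs
frame.cleanup_depth += 1             # the inner finally clause starts
during_fast = scope.may_interrupt()  # while do_fast_cleanup() runs
frame.cleanup_depth -= 1             # the inner finally clause ends
after_fast = scope.may_interrupt()
```

In this model the slow cleanup is treated as ordinary, interruptible
code, while the inner ``finally`` clause stays protected.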

But in practice, the example is considered to be too obscure to take
into account.


Modifying KeyboardInterrupt
---------------------------

It should be decided if the default ``SIGINT`` handler should be
modified to use the described mechanism.  The initial proposition is
to keep old behavior, for two reasons:

* Most applications do not care about cleanup on exit (either they do
  not have external state, or they modify it in a crash-safe way).

* Cleanup may take too much time, not giving the user a chance to
  interrupt an application.

The latter case can be fixed by allowing an unsafe break if a
``SIGINT`` handler is called twice, but it seems not worth the
complexity.


Alternative Python Implementations Support
==========================================

We consider ``f_in_cleanup`` an implementation detail.  The actual
implementation may have some fake frame-like object passed to the
signal handler and the cleanup hook, and returned from
``getcleanupframe()``.  The only requirement is that the ``inspect``
module functions work as expected on these objects.  For this reason,
we also allow passing a generator object to the ``isframeincleanup()``
function, which removes the need to use the ``gi_frame`` attribute.

It might be necessary to specify that ``getcleanupframe()`` must
return the same object that will be passed to the cleanup hook at the
next invocation.


Alternative Names
=================

The original proposal had an ``f_in_finally`` frame attribute, as the
original intention was to protect ``finally`` clauses.  But as the
proposal grew to protect ``__enter__`` and ``__exit__`` methods too,
the ``f_in_cleanup`` name seems better.  Although the ``__enter__``
method is not a cleanup routine, it at least relates to cleanup done
by context managers.

``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe``
could be spelled more readably as ``set_cleanup_hook``,
``is_frame_in_cleanup`` and ``get_cleanup_frame``, although the
current names follow the naming convention of their respective
modules.


Alternative Proposals
=====================

Propagating 'f_in_cleanup' Flag Automatically
---------------------------------------------

This can make ``getcleanupframe()`` unnecessary.  But for yield-based
coroutines you need to propagate it yourself.  Making it writable
leads to somewhat unpredictable behavior of ``setcleanuphook()``.


Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
--------------------------------------------

These bytecodes can be used to protect the expression inside the
``with`` statement, as well as making counter increments more explicit
and easy to debug (visible inside a disassembly).  Some middle ground
might be chosen, like ``END_FINALLY`` and ``SETUP_WITH`` implicitly
decrementing the counter (``END_FINALLY`` is present at end of every
``with`` suite).

However, adding new bytecodes must be considered very carefully.


Expose 'f_in_cleanup' as a Counter
----------------------------------

The original intention was to expose a minimum of needed
functionality.  However, as we consider the frame flag
``f_in_cleanup`` an implementation detail, we may expose it as a
counter.

Similarly, if we have a counter we may need to have the cleanup hook
called on every counter decrement.  It's unlikely to have much
performance impact as nested finally clauses are an uncommon case.


Add code object flag 'CO_CLEANUP'
---------------------------------

As an alternative to setting the flag inside the ``SETUP_WITH`` and
``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
When the interpreter starts to execute code with ``CO_CLEANUP`` set,
it sets ``f_in_cleanup`` for the whole function body.  This flag is
set for code objects of ``__enter__`` and ``__exit__`` special
methods.  Technically it might be set on functions called
``__enter__`` and ``__exit__``.

This seems to be a less clear solution.  It also covers the case where
``__enter__`` and ``__exit__`` are called manually.  This may be
accepted either as a feature or as an unnecessary side-effect (or,
though unlikely, as a bug).

It may also pose a problem when ``__enter__`` or ``__exit__``
functions are implemented in C, as there is no code object to check
for the ``CO_CLEANUP`` flag.


Have Cleanup Callback on Frame Object Itself
--------------------------------------------

The frame object may be extended to have an ``f_cleanup_callback``
member which is called when ``f_in_cleanup`` is reset to 0.  This
would help to register different callbacks for different coroutines.

Despite its apparent beauty, this solution doesn't add anything, as
the two primary use cases are:

* Setting the callback in a signal handler.  The callback is
  inherently a single one for this case.

* Use a single callback per loop for the coroutine use case.  Here, in
  almost all cases, there is only one loop per thread.


No Cleanup Hook
---------------

The original proposal included no cleanup hook specification, as there
are a few ways to achieve the same using current tools:

* Using ``sys.settrace()`` and the ``f_trace`` callback.  This may
  interfere with debugging, and has a big performance impact
  (although interrupting doesn't happen very often).

* Sleeping a bit more and trying again.  For a coroutine library this
  is easy.  For signals it may be achieved using ``signal.alarm``.

Both methods are considered too impractical, so a way to catch the
exit from ``finally`` clauses is proposed instead.


References
==========

.. [1] Monocle
   https://github.com/saucelabs/monocle

.. [2] Bluelet
   https://github.com/sampsyo/bluelet

.. [3] Twisted: inlineCallbacks
   http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html

.. [4] Original discussion
   http://mail.python.org/pipermail/python-ideas/2012-April/014705.html

.. [5] Issue #14730: Implementation of the PEP 419
   http://bugs.python.org/issue14730


Copyright
=========

This document has been placed in the public domain.



..
  Local Variables:
  mode: indented-text
  indent-tabs-mode: nil
  sentence-end-double-space: t
  fill-column: 70
  coding: utf-8
  End: