Source

python-peps / pep-0334.txt

Full commit
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
PEP: 334
Title: Simple Coroutines via SuspendIteration
Version: $Revision$
Last-Modified: $Date$
Author: Clark C. Evans <cce@clarkevans.com>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Aug-2004
Python-Version: 3.0
Post-History: 


Abstract
========

Asynchronous application frameworks such as Twisted [1]_ and Peak
[2]_, are based on a cooperative multitasking via event queues or
deferred execution.  While this approach to application development
does not involve threads and thus avoids a whole class of problems
[3]_, it creates a different sort of programming challenge.  When an
I/O operation would block, a user request must suspend so that other
requests can proceed.  The concept of a coroutine [4]_ promises to
help the application developer grapple with this state management
difficulty.

This PEP proposes a limited approach to coroutines based on an
extension to the iterator protocol [5]_.  Currently, an iterator may
raise a StopIteration exception to indicate that it is done producing
values.  This proposal adds another exception to this protocol,
SuspendIteration, which indicates that the given iterator may have
more values to produce, but is unable to do so at this time.


Rationale
=========

There are two current approaches to bringing co-routines to Python.
Christian Tismer's Stackless [6]_ involves a ground-up restructuring
of Python's execution model by hacking the 'C' stack.  While this
approach works, its operation is hard to describe and keep portable. A
related approach is to compile Python code to Parrot [7]_, a
register-based virtual machine, which has coroutines.  Unfortunately,
neither of these solutions is portable with IronPython (CLR) or Jython
(JavaVM).

It is thought that a more limited approach, based on iterators, could
provide a coroutine facility to application programmers and still be
portable across runtimes.

* Iterators keep their state in local variables that are not on the
  "C" stack.  Iterators can be viewed as classes, with state stored in
  member variables that are persistent across calls to its next()
  method.

* While an uncaught exception may terminate a function's execution, an
  uncaught exception need not invalidate an iterator.  The proposed
  exception, SuspendIteration, uses this feature.  In other words,
  just because one call to next() results in an exception does not
  necessarily need to imply that the iterator itself is no longer
  capable of producing values.

There are four places where this new exception impacts:

* The simple generator [8]_ mechanism could be extended to safely
  'catch' this SuspendIteration exception, stuff away its current
  state, and pass the exception on to the caller.

* Various iterator filters [9]_ in the standard library, such as
  itertools.izip should be made aware of this exception so that it can
  transparently propagate SuspendIteration.

* Iterators generated from I/O operations, such as a file or socket
  reader, could be modified to have a non-blocking variety.  This
  option would raise a subclass of SuspendIteration if the requested
  operation would block.

* The asyncore library could be updated to provide a basic 'runner'
  that pulls from an iterator; if the SuspendIteration exception is
  caught, then it moves on to the next iterator in its runlist [10]_.
  External frameworks like Twisted would provide alternative
  implementations, perhaps based on FreeBSD's kqueue or Linux's epoll.

While these may seem dramatic changes, it is a very small amount of
work compared with the utility provided by continuations.


Semantics
=========

This section will explain, at a high level, how the introduction of
this new SuspendIteration exception would behave.


Simple Iterators
----------------

The current functionality of iterators is best seen with a simple
example which produces two values 'one' and 'two'. ::

    class States:

        def __iter__(self):
            self._next = self.state_one
            return self

        def next(self):
            return self._next()

        def state_one(self):
            self._next = self.state_two
            return "one"

        def state_two(self):
            self._next = self.state_stop
            return "two"

        def state_stop(self):
            raise StopIteration

    print list(States())

An equivalent iteration could, of course, be created by the
following generator::

    def States():
        yield 'one'
        yield 'two'

    print list(States())


Introducing SuspendIteration
----------------------------

Suppose that between producing 'one' and 'two', the generator above
could block on a socket read.  In this case, we would want to raise
SuspendIteration to signal that the iterator is not done producing,
but is unable to provide a value at the current moment. ::

    from random import randint
    from time import sleep

    class SuspendIteration(Exception):
          pass

    class NonBlockingResource:

        """Randomly unable to produce the second value"""

        def __iter__(self):
            self._next = self.state_one
            return self

        def next(self):
            return self._next()

        def state_one(self):
            self._next = self.state_suspend
            return "one"

        def state_suspend(self):
            rand = randint(1,10)
            if 2 == rand:
                self._next = self.state_two
                return self.state_two()
            raise SuspendIteration()

        def state_two(self):
            self._next = self.state_stop
            return "two"

        def state_stop(self):
            raise StopIteration

    def sleeplist(iterator, timeout = .1):
        """
        Do other things (e.g. sleep) while resource is
        unable to provide the next value
        """
        it = iter(iterator)
        retval = []
        while True:
            try:
                retval.append(it.next())
            except SuspendIteration:
                sleep(timeout)
                continue
            except StopIteration:
                break
        return retval

    print sleeplist(NonBlockingResource())

In a real-world situation, the NonBlockingResource would be a file
iterator, socket handle, or other I/O based producer.  The sleeplist
would instead be an async reactor, such as those found in asyncore or
Twisted.  The non-blocking resource could, of course, be written as a
generator::

    def NonBlockingResource():
        yield "one"
        while True:
            rand = randint(1,10)
            if 2 == rand:
                break
            raise SuspendIteration()
        yield "two"

It is not necessary to add a keyword, 'suspend', since most real
content generators will not be in application code, they will be in
low-level I/O based operations.  Since most programmers need not be
exposed to the SuspendIteration() mechanism, a keyword is not needed.


Application Iterators
---------------------

The previous example is rather contrived, a more 'real-world' example
would be a web page generator which yields HTML content, and pulls
from a database.  Note that this is an example of neither the
'producer' nor the 'consumer', but rather of a filter. ::

    def ListAlbums(cursor):
        cursor.execute("SELECT title, artist FROM album")
        yield '<html><body><table><tr><td>Title</td><td>Artist</td></tr>'
        for (title, artist) in cursor:
            yield '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
        yield '</table></body></html>'

The problem, of course, is that the database may block for some time
before any rows are returned, and that during execution, rows may be
returned in blocks of 10 or 100 at a time. Ideally, if the database
blocks for the next set of rows, another user connection could be
serviced.  Note the complete absence of SuspendIterator in the above
code.  If done correctly, application developers would be able to
focus on functionality rather than concurrency issues.

The iterator created by the above generator should do the magic
necessary to maintain state, yet pass the exception through to a
lower-level async framework.  Here is an example of what the
corresponding iterator would look like if coded up as a class::

    class ListAlbums:

        def __init__(self, cursor):
            self.cursor = cursor

        def __iter__(self):
            self.cursor.execute("SELECT title, artist FROM album")
            self._iter = iter(self._cursor)
            self._next = self.state_head
            return self

        def next(self):
            return self._next()

        def state_head(self):
            self._next = self.state_cursor
            return "<html><body><table><tr><td>\
                    Title</td><td>Artist</td></tr>"

        def state_tail(self):
            self._next = self.state_stop
            return "</table></body></html>"

        def state_cursor(self):
            try:
                (title,artist) = self._iter.next()
                return '<tr><td>%s</td><td>%s</td></tr>' % (title, artist)
            except StopIteration:
                self._next = self.state_tail
                return self.next()
            except SuspendIteration:
                # just pass-through
                raise

        def state_stop(self):
            raise StopIteration


Complicating Factors
--------------------

While the above example is straight-forward, things are a bit more
complicated if the intermediate generator 'condenses' values, that is,
it pulls in two or more values for each value it produces. For
example, ::

    def pair(iterLeft,iterRight):
        rhs = iter(iterRight)
        lhs = iter(iterLeft)
        while True:
           yield (rhs.next(), lhs.next())

In this case, the corresponding iterator behavior has to be a bit more
subtle to handle the case of either the right or left iterator raising
SuspendIteration.  It seems to be a matter of decomposing the
generator to recognize intermediate states where a SuspendIterator
exception from the producing context could happen. ::

    class pair:

        def __init__(self, iterLeft, iterRight):
            self.iterLeft = iterLeft
            self.iterRight = iterRight

        def __iter__(self):
            self.rhs = iter(iterRight)
            self.lhs = iter(iterLeft)
            self._temp_rhs = None
            self._temp_lhs = None
            self._next = self.state_rhs
            return self

        def next(self):
            return self._next()

        def state_rhs(self):
            self._temp_rhs = self.rhs.next()
            self._next = self.state_lhs
            return self.next()

        def state_lhs(self):
            self._temp_lhs = self.lhs.next()
            self._next = self.state_pair
            return self.next()

        def state_pair(self):
            self._next = self.state_rhs
            return (self._temp_rhs, self._temp_lhs)

This proposal assumes that a corresponding iterator written using
this class-based method is possible for existing generators.  The
challenge seems to be the identification of distinct states within
the generator where suspension could occur.


Resource Cleanup
----------------

The current generator mechanism has a strange interaction with
exceptions where a 'yield' statement is not allowed within a
try/finally block.  The SuspendIterator exception provides another
similar issue.  The impacts of this issue are not clear. However it
may be that re-writing the generator into a state machine, as the
previous section did, could resolve this issue allowing for the
situation to be no-worse than, and perhaps even removing the
yield/finally situation.  More investigation is needed in this area.


API and Limitations
-------------------

This proposal only covers 'suspending' a chain of iterators, and does
not cover (of course) suspending general functions, methods, or "C"
extension function.  While there could be no direct support for
creating generators in "C" code, native "C" iterators which comply
with the SuspendIterator semantics are certainly possible.


Low-Level Implementation
========================

The author of the PEP is not yet familiar with the Python execution
model to comment in this area.


References
==========

.. [1] Twisted
   (http://twistedmatrix.com)

.. [2] Peak
   (http://peak.telecommunity.com)

.. [3] C10K
   (http://www.kegel.com/c10k.html)

.. [4] Coroutines
   (http://c2.com/cgi/wiki?CallWithCurrentContinuation)

.. [5] PEP 234, Iterators
   (http://www.python.org/dev/peps/pep-0234/)

.. [6] Stackless Python
   (http://stackless.com)

.. [7] Parrot /w coroutines
   (http://www.sidhe.org/~dan/blog/archives/000178.html)

.. [8] PEP 255, Simple Generators
   (http://www.python.org/dev/peps/pep-0255/)

.. [9] itertools - Functions creating iterators
   (http://docs.python.org/library/itertools.html)

.. [10] Microthreads in Python, David Mertz
   (http://www-106.ibm.com/developerworks/linux/library/l-pythrd.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: