PEP: 290
Title: Code Migration and Modernization
Version: $Revision$
Last-Modified: $Date$
Author: Raymond Hettinger <python@rcn.com>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 6-Jun-2002
Post-History:


Abstract
========

This PEP is a collection of procedures and ideas for updating Python
applications when newer versions of Python are installed.

The migration tips highlight possible areas of incompatibility and
make suggestions on how to find and resolve those differences.  The
modernization procedures show how older code can be updated to take
advantage of new language features.


Rationale
=========

This repository of procedures serves as a catalog or checklist of
known migration issues and procedures for addressing those issues.

Migration issues can arise for several reasons.  Some obsolete
features are slowly deprecated according to the guidelines in PEP 4
[1]_.  Also, some code relies on undocumented behaviors which are
subject to change between versions.  Some code may rely on behavior
which was subsequently shown to be a bug and that behavior changes
when the bug is fixed.

Modernization options arise when new versions of Python add features
that allow improved clarity or higher performance than previously
available.


Guidelines for New Entries
==========================

Developers with commit access may update this PEP directly.  Others
can send their ideas to a developer for possible inclusion.

While a consistent format makes the repository easier to use, feel
free to add or subtract sections to improve clarity.

Grep patterns may be supplied as a tool to help maintainers locate code
for possible updates.  However, fully automated search/replace style
regular expressions are not recommended.  Instead, each code fragment
should be evaluated individually.

The contra-indications section is the most important part of a new
entry.  It lists known situations where the update SHOULD NOT be
applied.


Migration Issues
================

Comparison Operators Not a Shortcut for Producing 0 or 1
--------------------------------------------------------

Prior to Python 2.3, comparison operations returned 0 or 1 rather
than True or False.  Some code may have used this as a shortcut for
producing zero or one in places where their boolean counterparts are
not appropriate.  For example::
    
    def identity(m=1):
        """Create an m-by-m identity matrix"""
        return [[i==j for i in range(m)] for j in range(m)]

In Python 2.2, a call to identity(2) would produce::

    [[1, 0], [0, 1]]

In Python 2.3, the same call would produce::

    [[True, False], [False, True]]

Since booleans are a subclass of integers, the matrix would continue
to calculate normally, but it will not print as expected.  The list
comprehension should be changed to read::

    return [[int(i==j) for i in range(m)] for j in range(m)]

There are similar concerns when storing data to be used by other
applications which may expect a number instead of True or False.
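
As a small sketch of the revised function (written to run unchanged on
current Python releases), wrapping the comparison in ``int()`` keeps the
entries as plain integers, so the matrix prints and stores as numbers:

```python
def identity(m=1):
    """Create an m-by-m identity matrix of plain integers."""
    # int() converts each True/False comparison result back to 1/0.
    return [[int(i == j) for i in range(m)] for j in range(m)]


matrix = identity(2)   # [[1, 0], [0, 1]]
```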


Modernization Procedures
========================

Procedures are grouped by the Python version required to be able to
take advantage of the modernization.

Python 2.4 or Later
-------------------

Inserting and Popping at the Beginning of Lists
'''''''''''''''''''''''''''''''''''''''''''''''

Python's lists are implemented to perform best with appends and pops on
the right.  Use of ``pop(0)`` or ``insert(0, x)`` triggers O(n) data
movement for the entire list.  To help address this need, Python 2.4
introduces a new container, ``collections.deque()`` which has efficient
append and pop operations on both the left and right (the trade-off
is much slower getitem/setitem access).  The new container is especially
helpful for implementing data queues:

Pattern::

    c = list(data)   -->   c = collections.deque(data)
    c.pop(0)         -->   c.popleft()
    c.insert(0, x)   -->   c.appendleft(x)

Locating::

    grep pop(0 or
    grep insert(0
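
A minimal sketch of a first-in, first-out queue built on
``collections.deque``; the function and item names are illustrative
only:

```python
from collections import deque


def serve_all(data, urgent):
    """Serve queued items in order, after one item pushed to the front."""
    queue = deque(data)          # was: queue = list(data)
    queue.appendleft(urgent)     # was: queue.insert(0, urgent)
    served = []
    while queue:
        served.append(queue.popleft())   # was: queue.pop(0)
    return served


order = serve_all(["a", "b", "c"], "rush")
```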

Simplifying Custom Sorts
''''''''''''''''''''''''

In Python 2.4, the ``sort`` method for lists and the new ``sorted``
built-in function both accept a ``key`` function for computing sort
keys.  Unlike the ``cmp`` function which gets applied to every
comparison, the key function gets applied only once to each record.
It is much faster than cmp and typically more readable while using
less code.  The key function also maintains the stability of the
sort (records with the same key are left in their original order).

Original code using a comparison function::

    names.sort(lambda x,y: cmp(x.lower(), y.lower()))

Alternative original code with explicit decoration::

    tempnames = [(n.lower(), n) for n in names]
    tempnames.sort()
    names = [original for decorated, original in tempnames]

Revised code using a key function::

    names.sort(key=str.lower)       # case-insensitive sort
                

Locating: ``grep sort *.py``
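
The following sketch, using made-up sample data, shows the key function
in use with ``sorted`` and illustrates the stability guarantee: names
that compare equal after lowercasing keep their original relative order.

```python
names = ["bob", "Alice", "carol", "Bob"]

# str.lower is called once per name; "bob" and "Bob" share the key "bob".
ordered = sorted(names, key=str.lower)

# Stability: "bob" preceded "Bob" in the input, so it still does here.
assert ordered == ["Alice", "bob", "Bob", "carol"]
```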

Replacing Common Uses of Lambda
'''''''''''''''''''''''''''''''

In Python 2.4, the ``operator`` module gained two new functions,
``itemgetter()`` and ``attrgetter()``, that can replace common uses of
the ``lambda`` keyword.  The new functions run faster and
are considered by some to improve readability.

Pattern::

    lambda r: r[2]      -->  itemgetter(2)
    lambda r: r.myattr  -->  attrgetter('myattr')

Typical contexts::

    sorted(studentrecords, key=attrgetter('gpa'))   # set a sort field
    map(attrgetter('lastname'), studentrecords)     # extract a field

Locating: ``grep lambda *.py``
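
A short sketch of both functions in the contexts above.  The ``Student``
class and the sample records are hypothetical stand-ins for whatever
record type an application uses:

```python
from operator import attrgetter, itemgetter


class Student:
    """Hypothetical record type with two attributes."""

    def __init__(self, lastname, gpa):
        self.lastname = lastname
        self.gpa = gpa


studentrecords = [Student("Ng", 3.1), Student("Abel", 3.9), Student("Diaz", 2.7)]

# was: key=lambda r: r.gpa
ranked = sorted(studentrecords, key=attrgetter("gpa"))
lastnames = list(map(attrgetter("lastname"), ranked))

# was: key=lambda r: r[1]
rows = [("x", 3), ("y", 1), ("z", 2)]
by_second = sorted(rows, key=itemgetter(1))
```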

Simplified Reverse Iteration
''''''''''''''''''''''''''''

Python 2.4 introduced the ``reversed`` builtin function for reverse
iteration.  The existing approaches to reverse iteration suffered
from wordiness, performance issues (speed and memory consumption),
and/or lack of clarity.  A preferred style is to express the
sequence in a forwards direction, apply ``reversed`` to the result,
and then loop over the resulting fast, memory friendly iterator.

Original code expressed with half-open intervals::

    for i in range(n-1, -1, -1):
        print seqn[i]

Alternative original code reversed in multiple steps::

    rseqn = list(seqn)
    rseqn.reverse()
    for value in rseqn:
        print value

Alternative original code expressed with extended slicing::

    for value in seqn[::-1]:
        print value

Revised code using the ``reversed`` function::

    for value in reversed(seqn):
        print value
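
As a quick check (with an arbitrary sample list), ``reversed`` yields
the same values as the index-based loop while leaving the original
sequence untouched:

```python
seqn = ["a", "b", "c"]

# Original style: walk the indices backwards.
by_index = [seqn[i] for i in range(len(seqn) - 1, -1, -1)]

# Revised style: reversed() returns an iterator, not a reversed copy.
by_reversed = list(reversed(seqn))
```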

Python 2.3 or Later
-------------------

Testing String Membership
'''''''''''''''''''''''''

In Python 2.3, for ``string2 in string1``, the length restriction on
``string2`` is lifted; it can now be a string of any length.  When
searching for a substring, where you don't care about the position of
the substring in the original string, using the ``in`` operator makes
the meaning clear.

Pattern::

    string1.find(string2) >= 0   -->  string2 in string1
    string1.find(string2) != -1  -->  string2 in string1
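
A brief sketch with hypothetical sample strings, showing that the two
spellings agree for a multi-character substring:

```python
string1 = "modernization"
string2 = "ern"

old_style = string1.find(string2) >= 0    # position is computed, then discarded
new_style = string2 in string1            # states the intent directly
```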

Replace apply() with a Direct Function Call
'''''''''''''''''''''''''''''''''''''''''''

In Python 2.3, apply() was marked for Pending Deprecation because it
was made obsolete by Python 1.6's introduction of * and ** in
function calls.  Using a direct function call was always a little
faster than apply() because it saved the lookup for the builtin.
Now, apply() is even slower due to its use of the warnings module.

Pattern::

    apply(f, args, kwds)  -->  f(*args, **kwds)

Note: The Pending Deprecation was removed from apply() in Python 2.3.3
since it creates pain for people who need to maintain code that works
with Python versions as far back as 1.5.2, where there was no
alternative to apply().  The function remains deprecated, however.
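
For illustration, a direct call with unpacked arguments; the function
and argument names below are hypothetical:

```python
def describe(name, greeting="hello", punctuation="!"):
    """Build a short greeting string."""
    return "%s, %s%s" % (greeting, name, punctuation)


args = ("world",)
kwds = {"greeting": "goodbye"}

# was: apply(describe, args, kwds)
result = describe(*args, **kwds)
```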


Python 2.2 or Later
-------------------

Testing Dictionary Membership
'''''''''''''''''''''''''''''

For testing dictionary membership, use the 'in' keyword instead of the
'has_key()' method.  The result is shorter and more readable.  The
style becomes consistent with tests for membership in lists.  The
result is slightly faster because ``has_key`` requires an attribute
search and uses a relatively expensive function call.

Pattern::

    if d.has_key(k):  -->  if k in d:

Contra-indications:

1. Some dictionary-like objects may not define a
   ``__contains__()`` method::

       if dictlike.has_key(k):

Locating: ``grep has_key``
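
Where you control such a class, one possible remedy is to add a
``__contains__()`` method that delegates to ``has_key()``.  The class
below is a hypothetical stand-in for a dictionary-like object, not a
real library class:

```python
class LegacyMapping:
    """Hypothetical dict-like class that originally offered only has_key()."""

    def __init__(self, data):
        self._data = dict(data)

    def has_key(self, k):
        return k in self._data

    # Reusing has_key as __contains__ lets callers write ``k in obj``.
    __contains__ = has_key


dictlike = LegacyMapping({"a": 1})
found = "a" in dictlike
missing = "b" in dictlike
```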


Looping Over Dictionaries
'''''''''''''''''''''''''

Use the new ``iter`` methods for looping over dictionaries.  The
``iter`` methods are faster because they do not have to create a new
list object with a complete copy of all of the keys, values, or items.
Selecting only keys, values, or items (key/value pairs) as needed
saves the time for creating throwaway object references and, in the
case of items, saves a second hash look-up of the key.

Pattern::

    for key in d.keys():      -->  for key in d:
    for value in d.values():  -->  for value in d.itervalues():
    for key, value in d.items():
                              -->  for key, value in d.iteritems():

Contra-indications:

1. If you need a list, do not change the return type::

       def getids():  return d.keys()

2. Some dictionary-like objects may not define
   ``iter`` methods::

       for k in dictlike.keys():

3. Iterators do not support slicing, sorting, or other list operations::

       k = d.keys(); j = k[:]

4. Dictionary iterators prohibit modifying the dictionary::

       for k in d.keys(): del d[k]
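
The fourth contra-indication can be sketched as follows.  This example
assumes a current Python release, where iterating over a dictionary
directly behaves like the ``iter`` methods described here; taking a list
snapshot of the keys first keeps deletion safe:

```python
d = {"a": 1, "b": 2, "c": 3}
try:
    # Deleting while iterating over the dictionary itself is an error.
    for k in d:
        del d[k]
    failed = False
except RuntimeError:
    failed = True

d = {"a": 1, "b": 2, "c": 3}
# Iterating over a separate list of keys leaves the dictionary free to change.
for k in list(d.keys()):
    if d[k] % 2:
        del d[k]
```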


``stat`` Methods
''''''''''''''''

Replace ``stat`` constants or indices with new ``os.stat`` attributes
and methods.  The ``os.stat`` attributes and methods are not
order-dependent and do not require an import of the ``stat`` module.

Pattern::

    os.stat("foo")[stat.ST_MTIME]  -->  os.stat("foo").st_mtime
    os.stat("foo")[stat.ST_MTIME]  -->  os.path.getmtime("foo")

Locating: ``grep os.stat`` or ``grep stat.S``
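
A small sketch comparing the three spellings on a throwaway temporary
file (the file exists only for this illustration).  On current Python
releases the index form yields whole seconds, while the attribute and
``os.path.getmtime`` return a float:

```python
import os
import stat
import tempfile

# Create a scratch file so there is something to stat.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    path = handle.name

try:
    info = os.stat(path)
    by_index = info[stat.ST_MTIME]      # order-dependent, needs the stat module
    by_attr = info.st_mtime             # named attribute, no extra import
    by_helper = os.path.getmtime(path)  # convenience wrapper
finally:
    os.remove(path)
```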


Reduce Dependency on ``types`` Module
'''''''''''''''''''''''''''''''''''''

The ``types`` module is likely to be deprecated in the future.  Use
built-in constructor functions instead.  They may be slightly faster.

Pattern::

    isinstance(v, types.IntType)      -->  isinstance(v, int)
    isinstance(s, types.StringTypes)  -->  isinstance(s, basestring)

Full use of this technique requires Python 2.3 or later
(``basestring`` was introduced in Python 2.3), but Python 2.2 is
sufficient for most uses.

Locating: ``grep types *.py | grep import``


Avoid Variable Names that Clash with the ``__builtins__`` Module
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

In Python 2.2, new built-in types were added for ``dict`` and ``file``.
Scripts should avoid assigning variable names that mask those types.
The same advice also applies to existing builtins like ``list``.

Pattern::

    file = open('myfile.txt') --> f = open('myfile.txt')
    dict = obj.__dict__ --> d = obj.__dict__

Locating:  ``grep 'file ' *.py``


Python 2.1 or Later
-------------------

``whrandom`` Module Deprecated
''''''''''''''''''''''''''''''

All random-related methods have been collected in one place, the
``random`` module.

Pattern::

    import whrandom --> import random

Locating: ``grep whrandom``


Python 2.0 or Later
-------------------

String Methods
''''''''''''''

The string module is likely to be deprecated in the future.  Use
string methods instead.  They're faster too.

Pattern::

    import string ; string.method(s, ...)  -->  s.method(...)
    c in string.whitespace                 -->  c.isspace()

Locating: ``grep string *.py | grep import``


``startswith`` and ``endswith`` String Methods
''''''''''''''''''''''''''''''''''''''''''''''

Use these string methods instead of slicing.  No slice has to be
created and there's no risk of miscounting.
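
The miscounting risk can be illustrated with a hypothetical file name,
where an off-by-one slice length silently makes the test always false:

```python
filename = "report_2002.txt"

# Slicing requires counting the prefix length by hand; 6 is one short of
# len("report_"), so this comparison can never be true.
miscounted = filename[:6] == "report_"

# startswith/endswith need no length at all.
has_prefix = filename.startswith("report_")
has_suffix = filename.endswith(".txt")
```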

Pattern::

    "foobar"[:3] == "foo"   -->  "foobar".startswith("foo")
    "foobar"[-3:] == "bar"  -->  "foobar".endswith("bar")


The ``atexit`` Module
'''''''''''''''''''''

The atexit module supports multiple functions to be executed upon
program termination.  Also, it supports parameterized functions.
Unfortunately, its implementation conflicts with the sys.exitfunc
attribute which only supports a single exit function.  Code relying
on sys.exitfunc may interfere with other modules (including library
modules) that elect to use the newer and more versatile atexit module.

Pattern::

    sys.exitfunc = myfunc  -->  atexit.register(myfunc)
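
A minimal sketch of registering several exit handlers, including one
with arguments.  The handler and its messages are hypothetical; handlers
registered with ``atexit`` run in reverse order of registration when the
interpreter exits normally:

```python
import atexit

log = []


def note(message, prefix="exit: "):
    """Record a shutdown step; stands in for real cleanup work."""
    log.append(prefix + message)


# Positional and keyword arguments are stored and passed at exit time.
atexit.register(note, "closing files")
atexit.register(note, "flushing cache", prefix="cleanup: ")
```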


Python 1.5 or Later
-------------------

Class-Based Exceptions
''''''''''''''''''''''

String exceptions are deprecated, so derive from the ``Exception``
base class.  Unlike the obsolete string exceptions, class exceptions
all derive from another exception or the ``Exception`` base class.
This allows meaningful groupings of exceptions.  It also allows an
"``except Exception``" clause to catch all exceptions.

Pattern::

    NewError = 'NewError'  -->  class NewError(Exception): pass

Locating: Use PyChecker_.
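
A short sketch of the grouping that class exceptions allow.  The class
and function names are hypothetical, and the ``except ... as`` spelling
assumes Python 2.6 or later:

```python
class AppError(Exception):
    """Base class for this (hypothetical) application's errors."""


class ConfigError(AppError):
    """Raised for configuration problems."""


def load(settings):
    if "path" not in settings:
        raise ConfigError("missing 'path' setting")
    return settings["path"]


try:
    load({})
except AppError as exc:      # also catches ConfigError and other subclasses
    message = str(exc)
```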


All Python Versions
-------------------

Testing for ``None``
''''''''''''''''''''

Since there is only one ``None`` object, equality can be tested with
identity.  Identity tests are slightly faster than equality tests.
Also, some object types may overload comparison, so equality testing
may be much slower.

Pattern::

    if v == None:  -->  if v is None:
    if v != None:  -->  if v is not None:

Locating: ``grep '== None'`` or ``grep '!= None'``
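
Beyond speed, an overloaded comparison can make the equality test give a
misleading answer.  A small sketch, using a hypothetical class:

```python
class AlwaysEqual:
    """Hypothetical class whose overloaded __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


v = AlwaysEqual()

equal_to_none = (v == None)   # calls AlwaysEqual.__eq__, which returns True
is_none = (v is None)         # pure identity check; __eq__ is never called
```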


References
==========

.. [1] PEP 4, Deprecation of Standard Modules, von Loewis
   (http://www.python.org/dev/peps/pep-0004/)

.. _PyChecker: http://pychecker.sourceforge.net/


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: