Differences between PyPy and CPython

This page documents the few differences and incompatibilities between the PyPy Python interpreter and CPython. Some of these differences are "by design", since we think that there are cases in which the behaviour of CPython is buggy, and we do not want to copy bugs.

Differences that are not listed here should be considered bugs of PyPy.

Extension modules

List of extension modules that we support:

  • Supported as built-in modules (in `pypy/module/`_):

    __builtin__ __pypy__ _ast _bisect _codecs _collections _continuation _ffi _hashlib _io _locale _lsprof _md5 _minimal_curses _multiprocessing _random _rawffi _sha _socket _sre _ssl _warnings _weakref _winreg array binascii bz2 cStringIO clr cmath cpyext crypt errno exceptions fcntl gc imp itertools marshal math mmap operator oracle parser posix pyexpat select signal struct symbol sys termios thread time token unicodedata zipimport zlib

    When translated to Java or .NET, the list is smaller; see `pypy/config/pypyoption.py`_ for details.

    When translated on Windows, a few Unix-only modules are skipped, and the following module is built instead:

    _winreg

  • Supported by being rewritten in pure Python (possibly using ctypes): see the `lib_pypy/`_ directory. Examples of modules that we support this way: ctypes, cPickle, cmath, dbm, datetime... Note that some modules are both in there and in the list above; by default, the built-in module is used (but can be disabled at translation time).

The extension modules (i.e. modules written in C, in the standard CPython) that are neither mentioned above nor in `lib_pypy/`_ are not available in PyPy. (You may have a chance to use them anyway with cpyext.)
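
As a rough sketch of how a program might cope with a missing extension module, the snippet below checks for the __pypy__ built-in module listed above and falls back to a pure-Python module when an accelerated one cannot be imported. The helper names running_on_pypy and load_pickle are hypothetical and chosen only for illustration.

```python
import sys

def running_on_pypy():
    # __pypy__ appears in the list of built-in modules above; this sketch
    # assumes its presence is a good enough marker for PyPy.
    return '__pypy__' in sys.builtin_module_names

def load_pickle():
    # Prefer the accelerated module when it exists, else the pure-Python one.
    try:
        import cPickle as pickle_module
    except ImportError:
        import pickle as pickle_module
    return pickle_module

pickle_module = load_pickle()
roundtrip = pickle_module.loads(pickle_module.dumps([1, 2, 3]))
```

The same try/except ImportError pattern applies to any optional extension module that may be absent.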

Subclasses of built-in types

Officially, CPython has no rule at all for when exactly overridden methods of subclasses of built-in types get implicitly called. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__() in a subclass of dict will not be called by the built-in get() method.
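
The get() case can be illustrated with a minimal sketch. The class name LoudDict is hypothetical, and the code avoids the print statement so that it runs on both Python 2 and Python 3:

```python
class LoudDict(dict):
    def __getitem__(self, key):
        # Overridden: ignores the stored value entirely.
        return 'overridden'

d = LoudDict(a='stored')

via_subscript = d['a']   # explicit subscripting uses the override
via_get = d.get('a')     # the built-in get() bypasses the override
```

Here via_subscript is 'overridden' while via_get is 'stored'.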

The above is true both in CPython and in PyPy. Differences can occur in whether a built-in function or method will call an overridden method of an object other than self. In PyPy, such methods are generally called, whereas in CPython they are not. For example, in PyPy, dict1.update(dict2) treats dict2 as a general mapping object, and will thus call the overridden keys() and __getitem__() methods on it. So the following code prints 42 on PyPy but foo on CPython:

>>>> class D(dict):
....     def __getitem__(self, key):
....         return 42
....
>>>>
>>>> d1 = {}
>>>> d2 = D(a='foo')
>>>> d1.update(d2)
>>>> print d1['a']
42
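
Code that must behave identically on both interpreters can avoid update() here and state its intent explicitly. The following is only a sketch of that idea, not a recommendation from the PyPy authors; the variable names with_override and stored are hypothetical.

```python
class D(dict):
    def __getitem__(self, key):
        return 42

d2 = D(a='foo')

# Explicit subscripting always goes through the overridden __getitem__.
with_override = {}
for key in d2.keys():
    with_override[key] = d2[key]

# Calling the base-class method directly always reads the stored value.
stored = {}
for key in d2.keys():
    stored[key] = dict.__getitem__(d2, key)
```

After this runs, with_override maps 'a' to 42 and stored maps 'a' to 'foo'.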

Mutating classes of objects which are already used as dictionary keys

Consider the following snippet of code:

class X(object):
    pass

def __evil_eq__(self, other):
    print 'hello world'
    return False

def evil(y):
    x = X()
    d = {x: 1}
    X.__eq__ = __evil_eq__
    d[y] # might trigger a call to __eq__?

In CPython, __evil_eq__ might be called, although there is no way to write a test which reliably calls it. It happens if y is not x and hash(y) == hash(x), where hash(x) is computed when x is inserted into the dictionary. If by chance the condition is satisfied, then __evil_eq__ is called.

PyPy uses a special strategy to optimize dictionaries whose keys are instances of user-defined classes which do not override the default __hash__, __eq__ and __cmp__: when using this strategy, __eq__ and __cmp__ are never called, but instead the lookup is done by identity, so in the case above it is guaranteed that __eq__ won't be called.

Note that in all other cases (e.g., if you have a custom __hash__ and __eq__ in y) the behavior is exactly the same as CPython.
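
The two situations can be contrasted with a small sketch. The classes Plain and Key and the list eq_calls are hypothetical names used only for this illustration.

```python
class Plain(object):
    # Default __hash__ and __eq__: instances are compared by identity.
    pass

p, q = Plain(), Plain()
plain_table = {p: 1}
p_found = p in plain_table   # same object, so it is found
q_found = q in plain_table   # a different object is a different key

eq_calls = []

class Key(object):
    # Custom __hash__ and __eq__: lookups compare by value.
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        eq_calls.append(self.value)
        return isinstance(other, Key) and self.value == other.value

key_table = {Key('a'): 1}
found = key_table[Key('a')]  # distinct object, equal hash: __eq__ decides
```

With Key, the lookup succeeds only because __eq__ is consulted, which is the case where both interpreters are stated to agree.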

Ignored exceptions

In many corner cases, CPython can silently swallow exceptions. The precise list of when this occurs is rather long, even though most cases are very uncommon. The best-known places are custom rich comparison methods (like __eq__), dictionary lookup, and calls to some built-in functions like isinstance().

Unless this behavior is clearly present by design and documented as such (as e.g. for hasattr()), in most cases PyPy lets the exception propagate instead.
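
One way to stay independent of this difference is to write special methods that do not raise for unexpected operands. The sketch below shows hasattr() hiding an AttributeError by design, next to an __eq__ that returns NotImplemented for foreign types instead of failing. The classes Lazy and Point are hypothetical.

```python
class Lazy(object):
    @property
    def value(self):
        raise AttributeError('not ready yet')

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        # Decline politely instead of raising on a foreign type.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

has_value = hasattr(Lazy(), 'value')      # the AttributeError is swallowed
mixed = (Point(1, 2) == 'not a point')    # no exception to swallow or leak
same = (Point(1, 2) == Point(1, 2))
```

Because Point.__eq__ never raises here, the result does not depend on whether the interpreter would have swallowed an exception.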

Object Identity of Primitive Values, is and id

In PyPy, object identity of primitive values works by value equality, not by identity of the wrapper. This means that x + 1 is x + 1 is always true, for arbitrary integers x. The rule applies to the following types:

  • int
  • float
  • long
  • complex

This change requires some changes to id as well. id fulfills the following condition: x is y <=> id(x) == id(y). Therefore id of the above types will return a value that is computed from the argument, and can thus be larger than sys.maxint (i.e. it can be an arbitrary long).
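
In practice this means numbers should be compared with == rather than is. A minimal sketch, in which the helper name same_number is hypothetical:

```python
def same_number(a, b):
    # Compare by value; whether two equal numbers are the same object
    # differs between implementations.
    return a == b

x = 10 ** 20
by_value = same_number(x + 1, x + 1)

# The stated condition "x is y <=> id(x) == id(y)" can be checked directly.
a = 3.5
b = a
identity_matches_id = (a is b) == (id(a) == id(b))
```

The result of x + 1 is x + 1 is deliberately not used above, since it depends on the interpreter.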

Miscellaneous

  • Hash randomization (-R) is ignored in PyPy. As documented in http://bugs.python.org/issue14621 , some of us believe it has no purpose in CPython either.
  • sys.setrecursionlimit(n) sets the limit only approximately, by setting the usable stack space to n * 768 bytes. On Linux, depending on the compiler settings, the default of 768KB is enough for about 1400 calls.
  • assignment to __class__ is limited to the cases where it works on CPython 2.5. On CPython 2.6 and 2.7 it works in a few more cases, which are not supported by PyPy so far. (If needed, it could be supported, but then it will likely work in many more cases on PyPy than on CPython 2.6/2.7.)
  • the __builtins__ name is always referencing the __builtin__ module, never a dictionary as it sometimes is in CPython. Assigning to __builtins__ has no effect.
  • directly calling the internal magic methods of a few built-in types with invalid arguments may have a slightly different result. For example, [].__add__(None) and (2).__add__(None) both return NotImplemented on PyPy; on CPython, only the latter does, and the former raises TypeError. (Of course, []+None and 2+None both raise TypeError everywhere.) This difference is an implementation detail that shows up because of internal C-level slots that PyPy does not have.
  • the __dict__ attribute of new-style classes returns a normal dict, as opposed to a dict proxy like in CPython. Mutating the dict will change the type and vice versa. For built-in types, a dictionary will be returned that cannot be changed (but still looks and behaves like a normal dictionary).
  • the __len__ or __length_hint__ special methods are sometimes called by CPython to get a length estimate to preallocate internal arrays. So far, PyPy never calls __len__ for this purpose, and never calls __length_hint__ at all.
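
A few of the points above can be sidestepped with idioms that behave the same on both interpreters. The sketch below is illustrative only: the names BYTES_PER_CALL_ESTIMATE, approx_stack_bytes, add_raises_type_error and Config are hypothetical, and the 768-byte figure is simply taken from the text above.

```python
# Import the builtins module by name instead of relying on __builtins__.
try:
    import __builtin__ as builtins   # Python 2
except ImportError:
    import builtins                  # Python 3

BYTES_PER_CALL_ESTIMATE = 768  # per-call figure quoted in the list above

def approx_stack_bytes(n):
    # Stack space that sys.setrecursionlimit(n) would reserve on PyPy,
    # according to the n * 768 rule stated above.
    return n * BYTES_PER_CALL_ESTIMATE

def add_raises_type_error(left, right):
    # Use the + operator, not left.__add__(right): the operator raises
    # TypeError for invalid operands everywhere.
    try:
        left + right
    except TypeError:
        return True
    return False

class Config(object):
    pass

# Change a class through setattr() rather than through its __dict__,
# which is a read-only proxy on CPython.
setattr(Config, 'debug', True)
```

For example, approx_stack_bytes(1000) gives 768000 bytes, which matches the 768KB default mentioned above.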