Armin Rigo committed 67e8823

Rewrite most of "Writing extension modules for pypy".


 This document tries to explain how to interface the PyPy python interpreter
 with any external library.
-Note: We try to describe state-of-the art, but it
-might fade out of date as this is the front on which things are changing
-in pypy rapidly.
+Right now, there are the following possibilities of providing
+third-party modules for the PyPy python interpreter (in order of
+usefulness):
+* Write them in pure Python and use CFFI_.
-Right now, there are three possibilities of providing third-party modules
-for the PyPy python interpreter (in order of usefulness):
+* Write them in pure Python and use ctypes_.
-* Write them in pure python and use ctypes, see ctypes_
-  section
+* Write them in C++ and bind them through Reflex_.
-* Write them in pure python and use direct libffi low-level bindings, See
-  \_ffi_ module description.
+* Write them as `RPython mixed modules`_.
-* Write them in RPython as mixedmodule_, using *rffi* as bindings.
-* Write them in C++ and bind them through Reflex_
-.. _ctypes: #CTypes
-.. _\_ffi: #LibFFI
-.. _mixedmodule: #Mixed Modules
+CFFI__ is the recommended way.  It is a way to write pure Python code
+that accesses C libraries.  The idea is to support either ABI- or
+API-level access to C --- so that you can sanely access C libraries
+without depending on details like the exact field order in the C
+structures or the numerical value of all the constants.  It works on
+both CPython (as a separate ``pip install cffi``) and on PyPy, where it
+is included by default.
+PyPy's JIT does a quite reasonable job on Python code that calls C
+functions or manipulates C pointers with CFFI.  (As of PyPy 2.2.1, it
+could still be improved, but is already good.)
+See the documentation here__.
+.. __:
+.. __:
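
As an illustration of the ABI-level access described above, here is a minimal sketch, not taken from the commit itself. It assumes a POSIX system, where ``ffi.dlopen(None)`` exposes the standard C library, and the wrapper name ``c_strlen`` is hypothetical.

```python
from cffi import FFI

ffi = FFI()

# Declare only the C signature we intend to call.  In ABI mode CFFI
# trusts this declaration, so it must match the real C prototype.
ffi.cdef("size_t strlen(const char *s);")

# Assumption: POSIX platform, where dlopen(None) gives access to the
# standard C library.  This does not work on Windows with Python 3.
libc = ffi.dlopen(None)


def c_strlen(data):
    """Hypothetical helper: return the length of a bytes object via C strlen."""
    return libc.strlen(data)


print(c_strlen(b"pypy"))
```

The same script should run unchanged on CPython (after ``pip install cffi``) and on PyPy. API-level access, which asks a C compiler for struct layouts and constant values, needs a build step and is not shown here.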
-The ctypes module in PyPy is ready to use.
-It's goal is to be as-compatible-as-possible with the
-`CPython ctypes`_ version. Right now it's able to support large examples,
-such as pyglet. PyPy is planning to have a 100% compatible ctypes
-implementation, without the CPython C-level API bindings (so it is very
-unlikely that direct object-manipulation trickery through this API will work).
+The goal of the ctypes module of PyPy is to be as compatible as possible
+with the `CPython ctypes`_ version.  It works for large examples, such
+as pyglet.  PyPy's implementation is not strictly 100% compatible with
+CPython, but close enough for most cases.
-We also provide a `ctypes-configure`_ for overcoming the platform dependencies,
-not relying on the ctypes codegen. This tool works by querying gcc about
-platform-dependent details (compiling small snippets of C code and running
-them), so it'll benefit not pypy-related ctypes-based modules as well.
+We also used to provide ``ctypes-configure`` for some API-level access.
+This is now viewed as a precursor of CFFI, which you should use instead.
+More (but older) information is available here__.
+Also, ctypes' performance is not as good as CFFI's.
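
For comparison, a minimal ctypes sketch of the same kind of call might look as follows. It is only an illustration, not part of the commit; it assumes a POSIX system, and the wrapper name ``c_abs`` is hypothetical.

```python
import ctypes
import ctypes.util

# Locate the C library.  find_library may return None on minimal
# systems; CDLL(None) then falls back to the symbols already loaded
# into the current process (assumption: POSIX platform).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Spell out the signature so ctypes converts and checks the argument
# instead of relying on its defaults.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int


def c_abs(value):
    """Hypothetical helper: absolute value computed by the C library."""
    return libc.abs(value)


print(c_abs(-7))
```

Unlike the ``cdef`` declaration used with CFFI, the signature here is assembled at runtime from Python objects, which is part of the dynamic overhead that makes ctypes slower.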
-ctypes call are optimized by the JIT and the resulting machine code contains a
-direct call to the target C function.  However, due to the very dynamic nature
-of ctypes, some overhead over a bare C call is still present, in particular to
-check/convert the types of the parameters.  Moreover, even if most calls are
-optimized, some cannot and thus need to follow the slow path, not optimized by
-the JIT.
+.. _`CPython ctypes`:
+.. __: ctypes-implementation.html
-.. _`ctypes-configure`: ctypes-implementation.html#ctypes-configure
-.. _`CPython ctypes`:
+PyPy implements ctypes as pure Python code around two built-in modules
+called ``_ffi`` and ``_rawffi``, which give a very low-level binding to
+the C library libffi_.  Nowadays it is not recommended to use these two
+modules directly.
+.. _libffi:
-Stable, CPython-compatible API.  Most calls are fast, optimized by JIT.
-Problems with platform-dependency (although we partially solve
-those). Although the JIT optimizes ctypes calls, some overhead is still
-present.  The slow-path is very slow.
-Mostly in order to be able to write a ctypes module, we developed a very
-low-level libffi bindings called ``_ffi``. (libffi is a C-level library for dynamic calling,
-which is used by CPython ctypes). This library provides stable and usable API,
-although it's API is a very low-level one. It does not contain any
-magic.  It is also optimized by the JIT, but has much less overhead than ctypes.
-It Works. Probably more suitable for a delicate code where ctypes magic goes
-in a way.  All calls are optimized by the JIT, there is no slow path as in
-It combines disadvantages of using ctypes with disadvantages of using mixed
-modules. CPython-incompatible API, very rough and low-level.
-Mixed Modules
-This is the most advanced and powerful way of writing extension modules.
-It has some serious disadvantages:
-* a mixed module needs to be written in RPython, which is far more
-  complicated than Python (XXX link)
-* due to lack of separate compilation (as of July 2011), each
-  compilation-check requires to recompile whole PyPy python interpreter,
-  which takes 0.5-1h. We plan to solve this at some point in near future.
-* although rpython is a garbage-collected language, the border between
-  C and RPython needs to be managed by hand (each object that goes into the
-  C level must be explicitly freed).
-Some documentation is available `here`_
-.. _`here`: rffi.html
-XXX we should provide detailed docs about lltype and rffi, especially if we
-    want people to follow that way.
-This method is still experimental and is being exercised on a branch,
-`reflex-support`_, which adds the `cppyy`_ module.
+This method is still experimental.  It adds the `cppyy`_ module.
 The method works by using the `Reflex package`_ to provide reflection
 information of the C++ code, which is then used to automatically generate
 bindings at runtime.
 to work around it in python or with a C++ helper function.
 Although Reflex works on various platforms, the bindings with PyPy have only
 been tested on Linux.
+RPython Mixed Modules
+This is the internal way to write built-in extension modules in PyPy.
+It cannot be used by any 3rd-party module: the extension modules are
+*built-in*, not independently loadable DLLs.
+This is reserved for special cases: it gives direct access to e.g. the
+details of the JIT, allowing us to tweak its interaction with user code.
+This is how the numpy module is being developed.