
<!DOCTYPE html>
	<title>PyPy :: Performance</title>
	<meta http-equiv="content-language" content="en" />
	<meta http-equiv="content-type" content="text/html; charset=utf-8" />
	<meta name="author" content="PyPy Team" />
	<meta name="description" content="PyPy" />
	<meta name="copyright" content="MIT" />
	<meta name="document-rating" content="general" />
	<link rel="stylesheet" type="text/css" media="screen" title="default" href="css/site.css" />
	<link rel="alternate" type="application/rss+xml" title="RSS Feed for PyPy" href="" />
  <link rel="stylesheet" type="text/css" href="css/jquery-ui-1.8.14.custom.css" />
	<script type="text/javascript" src=""></script>
	<script type="text/javascript">try{Typekit.load();}catch(e){}</script>
	<script type="text/javascript" src=""></script>
  <script type="text/javascript" src="js/jquery-ui-1.8.14.custom.min.js"></script>
  <script type="text/javascript" src="js/detect.js"></script>
  <script type="text/javascript" src="js/script2.js?bust=1"></script>
<script type="text/javascript">
	var _gaq = [['_setAccount', 'UA-7778406-3'], ['_trackPageview']];
	if (document.location.protocol !== 'file:') {
		(function() {
			var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
			ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '';
			(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(ga);
<div id="body-outer"><div id="body-inner"><div id="body" class="clearfix">
<div id="header">
	<div id="menu-follow">
		<div><a href="" title="Follow the conversation on Twitter"><img src="" alt="Follow the conversation on Twitter" width="14px" height="14px" /></a></div>
    <div><a href=""><img src="" width="14px" height="14px" /></a></div>
		<div><a href="" title="Subscribe to the RSS Feed"><img src="" alt="Subscribe to the RSS Feed" width="14px" height="14px" /></a></div>
	<div id="logo"><a href=""><img src="image/pypy-logo.png" alt="PyPy" height="110px" /></a></div>
	<hr class="clear-left" />
	<div id="menu-sub"><a href="index.html">Home</a><span class="menu-sub-sep"> | </span><a href="features.html">Features</a><span class="menu-sub-sep"> | </span><a href="download.html">Download</a><span class="menu-sub-sep"> | </span><a href="compat.html">Compatibility</a><span class="menu-sub-sep"> | </span><a href="performance.html">Performance</a><span class="menu-sub-sep"> | </span><a href="">Dev Documentation</a><span class="menu-sub-sep"> | </span><a href="">Blog</a><span class="menu-sub-sep"> | </span><a href="people.html">People</a><span class="menu-sub-sep"> | </span><a href="contact.html">Contact</a><span class="menu-sub-sep"> | </span><a href="py3donate.html">Py3k donations</a><span class="menu-sub-sep"> | </span><a href="numpydonate.html">NumPy donations</a><span class="menu-sub-sep"> | </span><a href="tmdonate.html">STM/AME donations</a></div>
	<hr class="clear" />
<div id="content">
<div id="main">
<h1 class="title">Performance</h1>
<p>One of the goals of the PyPy project is to provide a fast and compliant
Python interpreter. Some of the ways we achieve this are by providing a
high-performance garbage collector (GC) and a high-performance
Just-in-Time compiler (JIT).  Results of comparing PyPy and CPython can
be found on the <a class="reference external" href="">speed website</a>. Those benchmarks are not a random
collection: they are a combination of real-world Python programs &ndash;
benchmarks originally included with the (now dead) Unladen Swallow
project &ndash; and benchmarks for which we found PyPy to be slow (and improved).
Consult the descriptions of each for details.</p>
<p>The JIT, however, is not a magic bullet. There are several characteristics
that might surprise people who are not used to JITs in
general or to the PyPy JIT in particular.  The JIT is generally good at
speeding up straightforward Python code that spends a lot of time in the
bytecode dispatch loop, i.e., running actual Python code &ndash; as opposed
to running things that are only invoked by Python code.  Good
examples include numeric calculations or any kind of heavily
object-oriented program.  Bad examples include doing computations with
large longs &ndash; which are performed by unoptimizable support code.  When the
JIT cannot help, PyPy is generally slower than CPython.</p>
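<p>As a rough illustration of the distinction above, the following sketch
contrasts a loop of plain Python arithmetic with a computation dominated by
large longs.  The function names and input sizes are hypothetical and chosen
only for illustration; actual timings depend on the interpreter and machine,
so none are shown here.</p>

```python
import time


def float_loop(n):
    # Plain Python arithmetic in a loop: the time is spent executing
    # Python bytecode, the kind of code the text describes as JIT-friendly.
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


def big_long_product(n):
    # The running product quickly becomes a very large integer, so most of
    # the time is spent in the interpreter's long-integer support code.
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def timed(func, arg):
    # Return (result, elapsed seconds) for a single call.
    start = time.time()
    result = func(arg)
    return result, time.time() - start


if __name__ == "__main__":
    # Input sizes are arbitrary assumptions, not recommended settings.
    for func, arg in ((float_loop, 2000000), (big_long_product, 5000)):
        _, elapsed = timed(func, arg)
        print("%s(%d): %.3fs" % (func.__name__, arg, elapsed))
```

<p>Running the same script under both interpreters is one way to see which
of the two categories a given workload falls into.</p>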
<p>More specifically, the JIT is known not to work on:</p>
<ul class="simple">
<li><strong>Tests</strong>: The ideal unit tests execute each piece of tested code
once.  This leaves no time for the JIT to warm up.</li>
<li><strong>Really short-running scripts</strong>: A rule of thumb is that if something runs for less
than 0.2s the JIT has no chance, but it depends a lot on the program in question.
In general, make sure you warm up your program before running benchmarks, if
you're measuring something long-running like a server.  The time required
to warm up the JIT varies; give it at least a couple of seconds.  (PyPy's
JIT takes an especially long time to warm up.)</li>
<li><strong>Long-running runtime functions</strong>: These are the functions provided
by the runtime of PyPy that do a significant amount of work.
PyPy's runtime is generally not as optimized as CPython's and we expect those
functions to take somewhere between the same time as CPython and twice as long.
This includes, for example, computing with longs, or sorting large lists.
A counterexample is regular expressions: although they take time, they
come with their own JIT.</li>
</ul>
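<p>The warm-up advice above can be sketched as a small timing helper.  This
is a minimal sketch, not a full benchmarking harness: <tt class="docutils literal">work</tt>
is a hypothetical stand-in for the code being measured, and the two-second
default simply follows the &ldquo;couple of seconds&rdquo; guideline.</p>

```python
import time


def work(n):
    # Hypothetical workload standing in for the code being benchmarked.
    total = 0
    for i in range(n):
        total += i % 7
    return total


def measure(func, arg, warmup_seconds=2.0, timed_rounds=5):
    # Warm-up phase: keep calling the workload until the deadline passes,
    # so a JIT has a chance to compile the hot loops.  These calls are
    # deliberately not timed.
    deadline = time.time() + warmup_seconds
    while time.time() < deadline:
        func(arg)
    # Measurement phase: time each round separately.
    timings = []
    for _ in range(timed_rounds):
        start = time.time()
        func(arg)
        timings.append(time.time() - start)
    return timings


if __name__ == "__main__":
    timings = measure(work, 1000000)
    print("best of %d: %.4fs" % (len(timings), min(timings)))
```

<p>Reporting the per-round timings rather than a single total makes it
easier to spot whether the measured rounds were still warming up.</p>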
<p>Unrelated things that we know PyPy to be slow at (note that we're probably
working on them):</p>
<ul class="simple">
<li><strong>Building very large dicts</strong>: At present, this is an issue with our GCs.
Building large lists works much better; the random order of
dictionary elements is what hurts performance right now.</li>
<li><strong>CPython C extension modules</strong>: Any C extension module recompiled
with PyPy takes a very large hit in performance.  PyPy supports C
extension modules solely to provide basic functionality.
If the extension module is for speedup purposes only, then it
makes no sense to use it with PyPy at the moment.  Instead, remove it
and use a native Python implementation, which also allows opportunities
for JIT optimization.  If the extension module is
both performance-critical and an interface to some C library, then it
might be worthwhile to consider rewriting it as a pure Python version
that uses something like <tt class="docutils literal">ctypes</tt> for the interface.</li>
<li><strong>Missing RPython modules</strong>: A few modules of the standard library
(like <tt class="docutils literal">csv</tt> and <tt class="docutils literal">cPickle</tt>) are written in C in CPython, but written
natively in pure Python in PyPy.  Sometimes the JIT is able to do a
good job on them, and sometimes not.  In most cases (like <tt class="docutils literal">csv</tt> and
<tt class="docutils literal">cPickle</tt>), we're slower than CPython, with the notable exception of
<tt class="docutils literal">json</tt> and <tt class="docutils literal">heapq</tt>.</li>
<li><strong>Abuse of itertools</strong>: The itertools module is often &ldquo;abused&rdquo; in the
sense that it is used for the wrong purposes.  From our point of view,
itertools is great if you have iterations over millions of items, but
not for most other cases.  It gives you 3 lines in functional style
that replace 10 lines of Python loops (longer but arguably much easier
to read).  The pure Python version is generally not slower even on
CPython, and on PyPy it allows the JIT to work much better &ndash; simple
Python code is fast.  The same argument also applies to <tt class="docutils literal">filter()</tt>,
<tt class="docutils literal">reduce()</tt>, and to some extent <tt class="docutils literal">map()</tt> (although the simple case
is JITted), and to all usages of the <tt class="docutils literal">operator</tt> module we can think
of.</li>
<li><strong>Ctypes</strong>: Ctypes is a mixed bunch. If you're lucky you'll hit the
sweet spot and be <strong>really</strong> fast. If you're unlucky, you'll miss the
sweet spot and hit the slow path, which is much slower than CPython (2-10x
has been reported).</li>
</ul>
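<p>To make the itertools point above concrete, here is a small sketch of
the same computation written twice: once in functional style and once as a
plain loop.  Both function names are hypothetical, and the example only
shows that the two forms are equivalent; it makes no claim about timings.</p>

```python
import operator
from functools import reduce
from itertools import takewhile


def sum_small_squares_functional(numbers, limit):
    # Functional style: map, takewhile, and reduce with operator.add.
    squares = map(lambda x: x * x, numbers)
    small = takewhile(lambda sq: sq < limit, squares)
    return reduce(operator.add, small, 0)


def sum_small_squares_loop(numbers, limit):
    # Plain loop: longer, but simple Python code, which is the style the
    # text above says lets the JIT work better.
    total = 0
    for x in numbers:
        sq = x * x
        if sq >= limit:
            break
        total += sq
    return total


if __name__ == "__main__":
    print(sum_small_squares_functional(range(10), 30))
    print(sum_small_squares_loop(range(10), 30))
```

<p>Both calls sum the squares 0, 1, 4, 9, 16, and 25 and stop at 36, so
they return the same value.</p>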
<p>We generally consider things that are slower on PyPy than CPython to be bugs
of PyPy.  If you find some issue that is not documented here,
please report it to our <a class="reference external" href="">bug tracker</a> for investigation.</p>
<div id="sidebar">