
Anonymous committed aa6ed93

Move pypy/doc/jit to rpython/doc/jit.

  • Parent commits f56db6a
  • Branches rpython-doc


Files changed (6)

File pypy/doc/jit/index.rst

-========================================================================
-                          JIT documentation
-========================================================================
-
-:abstract:
-
-    When PyPy is translated into an executable such as ``pypy-c``, the
-    executable contains a full virtual machine that can optionally
-    include a Just-In-Time compiler.  This JIT compiler is **generated
-    automatically from the interpreter** that we wrote in RPython.
-
-    This JIT Compiler Generator can be applied to interpreters for any
-    language, as long as the interpreter itself is written in RPython
-    and contains a few hints to guide the JIT Compiler Generator.
-
-
-Content
-------------------------------------------------------------
-
-- Overview_: motivating our approach
-
-- Notes_ about the current work in PyPy
-
-- Hooks_: debugging facilities available to a Python programmer
-
-
-.. _Overview: overview.html
-.. _Notes: pyjitpl5.html
-.. _Hooks: ../jit-hooks.html

File pypy/doc/jit/overview.rst

-------------------------------------------------------------------------
-                   Motivating JIT Compiler Generation
-------------------------------------------------------------------------
-
-.. contents::
-
-This is a non-technical introduction and motivation for PyPy's approach
-to Just-In-Time compiler generation.
-
-
-Motivation
-========================================================================
-
-Overview
---------
-
-Writing an interpreter for a complex dynamic language like Python is not
-a small task, especially if, for performance goals, we want to write a
-Just-in-Time (JIT) compiler too.
-
-The good news is that it's not what we did.  We indeed wrote an
-interpreter for Python, but we never wrote any JIT compiler for Python
-in PyPy.  Instead, we use the fact that our interpreter for Python is
-written in RPython, which is a nice, high-level language -- and we turn
-it *automatically* into a JIT compiler for Python.
-
-This transformation is of course completely transparent to the user,
-i.e. the programmer writing Python programs.  The goal (which we
-achieved) is to support *all* Python features -- including, for example,
-random frame access and debuggers.  But it is also mostly transparent to
-the language implementor, i.e. to the source code of the Python
-interpreter.  It only needs a bit of guidance: we had to put a small
-number of hints in the source code of our interpreter.  Based on these
-hints, the *JIT compiler generator* produces a JIT compiler which has
-the same language semantics as the original interpreter by construction.
-This JIT compiler itself generates machine code at runtime, aggressively
-optimizing the user's program and leading to a big performance boost,
-while keeping the semantics unmodified.  Of course, the interesting bit
-is that our Python language interpreter can evolve over time without
-getting out of sync with the JIT compiler.
-
-
-The path we followed
---------------------
-
-Our previous incarnations of PyPy's JIT generator were based on partial
-evaluation. This is a well-known and much-researched topic, considered
-to be very promising. There have been many attempts to use it to
-automatically transform an interpreter into a compiler. However, none of
-them have led to substantial speedups for real-world languages. We
-believe that the missing key insight is to use partial evaluation to
-produce just-in-time compilers, rather than classical ahead-of-time
-compilers.  If this turns out to be correct, the practical speed of
-dynamic languages could be vastly improved.
-
-All these previous JIT compiler generators were producing JIT compilers
-similar to the hand-written Psyco.  But since 2009, our prototype no
-longer uses partial evaluation -- at least not in a way that would
-convince paper reviewers.  It is instead based on the notion of a
-*tracing JIT*, recently studied for Java and JavaScript.  Compared to
-existing tracing JITs, however, partial evaluation gives us some extra
-techniques that we already had in our previous JIT generators, notably
-how to optimize structures by removing allocations.
-
-The closest comparison to our current JIT is Mozilla's TraceMonkey.
-However, this JIT compiler is written manually, which is quite some
-effort.  In PyPy, we write a JIT generator at the level of RPython,
-which means that our final JIT does not have to -- indeed, cannot -- be
-written to encode all the details of the full Python language.  These
-details are automatically supplied by the fact that we have an
-interpreter for full Python.
-
-
-Practical results
------------------
-
-The JIT compilers that we generate use some techniques that are not in
-widespread use so far, but they are not exactly new either.  The point
-we want to make here is not that we are pushing the theoretical limits
-of how fast a given dynamic language can be run.  Our point is: we are
-making it **practical** to have reasonably good Just-In-Time compilers
-for all dynamic languages, no matter how complicated or non-widespread
-(e.g. Open Source dynamic languages without large industry or academic
-support, or internal domain-specific languages).  By practical we mean
-that this should be:
-
-* Easy: requires little more effort than writing the interpreter in the
-  first place.
-
-* Maintainable: our generated JIT compilers are not separate projects
-  (we do not generate separate source code, but only throw-away C code
-  that is compiled into the generated VM).  In other words, the whole
-  JIT compiler is regenerated anew every time the high-level interpreter
-  is modified, so that it cannot get out of sync no matter how fast
-  the language evolves.
-
-* Fast enough: we can get some rather good performance out of the
-  generated JIT compilers.  That's the whole point, of course.
-
-
-Alternative approaches to improve speed
-========================================================================
-
-+----------------------------------------------------------------------+
-| :NOTE:                                                               |
-|                                                                      |
-|   Please take the following section as just a statement of opinion.  |
-|   In order to be debated over, the summaries should first be         |
-|   expanded into full arguments.  We include them here as links;      |
-|   we are aware of them, even if sometimes pessimistic about them     |
-|   ``:-)``                                                            |
-+----------------------------------------------------------------------+
-
-There are a large number of approaches to improving the execution speed of
-dynamic programming languages, most of which produce only small
-improvements; none offers the flexibility and customisability provided
-by our approach.
-Over the last 6 years of tweaking, the speed of CPython has only improved by a
-factor of 1.3 or 1.4 (depending on benchmarks).  Many tweaks are applicable to
-PyPy as well. Indeed, some of the CPython tweaks originated as tweaks for PyPy.
-
-IronPython initially achieved a speed of about 1.8 times that of CPython by
-leaving out some details of the language and by leveraging the large investment
-that Microsoft has put into making the .NET platform fast; the current, more
-complete implementation has roughly the same speed as CPython.  In general, the
-existing approaches have reached the end of the road, speed-wise.  Microsoft's
-Dynamic Language Runtime (DLR), often cited in this context, is essentially
-only an API to make the techniques pioneered in IronPython official.  At best,
-it will give another small improvement.
-
-Another technique regularly mentioned is adding types to the language in order
-to speed it up: either explicit optional typing or soft typing (i.e., inferred
-"likely" types).  For Python, all projects in this area have started with a
-simplified subset of the language; no project has scaled up to anything close
-to the complete language.  This would be a major effort and be platform- and
-language-specific.  Moreover, maintenance would be a headache: we
-believe that many changes that are trivial to implement in CPython are
-likely to invalidate
-previous carefully-tuned optimizations.
-
-For major improvements in speed, JIT techniques are necessary.  For Python,
-Psyco gives typical speedups of 2 to 4 times - up to 100 times in algorithmic
-examples.  It has come to a dead end because of the difficulty and huge costs
-associated with developing and maintaining it.  It has a relatively poor
-encoding of language semantics - knowledge about Python behavior needs to be
-encoded by hand and kept up-to-date.  At least, Psyco works correctly even when
-encountering one of the numerous Python constructs it does not support, by
-falling back to CPython.  The PyPy JIT started out as a metaprogrammatic,
-non-language-specific equivalent of Psyco.
-
-A different kind of prior art is self-hosting JIT compilers such as Jikes.
-Jikes is a JIT compiler for Java written in Java. It has a poor encoding of
-language semantics; it would take an enormous amount of work to encode all the
-details of a Python-like language directly into a JIT compiler.  It also has
-limited portability, which is an issue for Python; it is likely that large
-parts of the JIT compiler would need retargetting in order to run in a
-different environment than the intended low-level one.
-
-Simply reusing an existing well-tuned JIT like that of the JVM does not
-really work, because of concept mismatches between the implementor's
-language and the host VM language: the former needs to be compiled to
-the target environment in such a way that the JIT is able to speed it up
-significantly - an approach which essentially has failed in Python so
-far: even though CPython is a simple interpreter, its Java and .NET
-re-implementations are not significantly faster.
-
-More recently, several larger projects have started in the JIT area.  For
-instance, Sun Microsystems is investing in JRuby, which aims to use the Java
-Hotspot JIT to improve the performance of Ruby. However, this requires a lot of
-hand crafting and will only provide speedups for one language on one platform.
-Some issues are delicate, e.g., how to remove the overhead of constantly boxing
-and unboxing, typical in dynamic languages.  An advantage compared to PyPy is
-that there are some hand optimizations that can be performed, that do not fit
-in the metaprogramming approach.  But metaprogramming makes the PyPy JIT
-reusable for many different languages on many different execution platforms.
-It is also possible to combine the approaches - we can get substantial speedups
-using our JIT and then feed the result to Java's Hotspot JIT for further
-improvement.  One of us is even a member of the `JSR 292`_ Expert Group
-to define additions to the JVM to better support dynamic languages, and
-is contributing insights from our JIT research, in ways that will also
-benefit PyPy.
-
-Finally, tracing JITs are now emerging for dynamic languages like
-JavaScript with TraceMonkey.  The JIT that PyPy generates is built on
-very similar concepts, but it is not hand-written.
-
-
-Further reading
-========================================================================
-
-The description of the current PyPy JIT generator is given in PyJitPl5_
-(draft).
-
-.. _`JSR 292`: http://jcp.org/en/jsr/detail?id=292
-.. _PyJitPl5: pyjitpl5.html

File pypy/doc/jit/pyjitpl5.rst

-==========
- PyJitPl5
-==========
-
-This document describes the fifth generation of PyPy's JIT.
-
-
-Implementation of the JIT
-=========================
-
-The JIT's `theory`_ is great in principle, but the actual code is a different
-story. This section tries to give a high level overview of how PyPy's JIT is
-implemented.  It's helpful to have an understanding of how the `RPython translation
-toolchain`_ works before digging into the sources.
-
-Almost all JIT-specific code is found in pypy/jit subdirectories.
-Translation-time code is in the codewriter directory.  The metainterp
-directory holds platform-independent code, including the tracer and the
-optimizer.  Code in the backend directory is responsible for generating
-machine code.
-
-.. _`theory`: overview.html
-.. _`RPython translation toolchain`: ../translation.html
-
-
-JIT hints
----------
-
-To add a JIT to an interpreter, PyPy only requires that two hints be added to
-the target interpreter.  These are jit_merge_point and can_enter_jit.
-jit_merge_point is supposed to go at the start of opcode dispatch.  It allows
-the JIT to bail back to the interpreter in case running machine code is no
-longer suitable.  can_enter_jit goes at the end of an application-level loop.  In
-the Python interpreter, this is the JUMP_ABSOLUTE bytecode.  The Python
-interpreter defines its hints in pypy/module/pypyjit/interp_jit.py in a few
-overridden methods of the default interpreter loop.
-
-An interpreter wishing to use PyPy's JIT must define a list of *green*
-variables and a list of *red* variables.  The *green* variables are loop
-constants.  They are used to identify the current loop.  Red variables are for
-everything else used in the execution loop.  For example, the Python interpreter
-passes the code object and the instruction pointer as greens and the frame
-object and execution context as reds.  These objects are passed to the JIT at
-the location of the JIT hints.
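The hint placement described above can be sketched in a few lines.  This is an illustrative toy interpreter, not PyPy's real one: the bytecode format is made up, and the stub JitDriver class merely mirrors the greens/reds call shape of the real ``rpython.rlib.jit.JitDriver`` (whose hints are likewise no-ops until the interpreter is translated with the JIT enabled):

```python
# Stub standing in for rpython.rlib.jit.JitDriver, so this sketch runs
# as plain Python; the real class takes the same greens/reds declaration.
class JitDriver(object):
    def __init__(self, greens, reds):
        self.greens, self.reds = greens, reds
    def jit_merge_point(self, **live):  # start of opcode dispatch
        pass
    def can_enter_jit(self, **live):    # closing of an application-level loop
        pass

# greens identify the loop (position in the program); reds are the rest.
driver = JitDriver(greens=['pc', 'bytecode'], reds=['acc'])

DECR, JUMP_IF_NONZERO, EXIT = 0, 1, 2   # made-up toy bytecodes

def interp(bytecode, acc):
    pc = 0
    while True:
        driver.jit_merge_point(pc=pc, bytecode=bytecode, acc=acc)
        op = bytecode[pc]
        if op == DECR:
            acc -= 1
            pc += 1
        elif op == JUMP_IF_NONZERO:
            target = bytecode[pc + 1]
            if acc != 0:
                if target < pc:
                    # a backward jump closes an application-level loop
                    driver.can_enter_jit(pc=target, bytecode=bytecode, acc=acc)
                pc = target
            else:
                pc += 2
        elif op == EXIT:
            return acc

print(interp([DECR, JUMP_IF_NONZERO, 0, EXIT], 10))   # prints 0
```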
-
-
-JIT Generation
---------------
-
-After the RTyping phase of translation, where high level Python operations are
-turned into low-level ones for the backend, the translation driver calls
-apply_jit() in metainterp/warmspot.py to add a JIT compiler to the currently
-translating interpreter.  apply_jit() decides what assembler backend to use, then
-delegates the rest of the work to the WarmRunnerDesc class.  WarmRunnerDesc
-finds the two JIT hints in the function graphs.  It rewrites the graph
-containing the jit_merge_point hint, called the portal graph, to be able to
-handle special JIT exceptions, which indicate special conditions to the
-interpreter upon exiting from the JIT.  The location of the can_enter_jit hint
-is replaced with a call to a function, maybe_compile_and_run in warmstate.py,
-that checks whether the current loop is "hot" and should be compiled.
-
-Next, starting with the portal graph, codewriter/\*.py converts the graphs of the
-interpreter into JIT bytecode.  Since this bytecode is stored in the final
-binary, it's designed to be concise rather than fast.  The codewriter
-doesn't "see" every part of the interpreter (what it sees is defined by
-the JIT's policy); where it cannot see, it simply inserts an opaque
-call.
-
-Finally, translation finishes, including the bytecode of the interpreter in the
-final binary, and the interpreter is ready to use the runtime component of the JIT.
-
-
-Tracing
--------
-
-Application code running on the JIT-enabled interpreter starts normally; it is
-interpreted on top of the usual evaluation loop.  When an application loop is
-closed (where the can_enter_jit hint was), the interpreter calls the
-maybe_compile_and_run() method of WarmEnterState.  This method increments a
-counter associated with the current green variables.  When this counter reaches
-a certain level, usually indicating the application loop has been run many
-times, the JIT enters tracing mode.
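The counting logic can be sketched as follows.  The class and method names echo the ones mentioned above, but the threshold value and the counter bookkeeping here are made up for illustration; the real logic lives in warmstate.py and is considerably more involved:

```python
# Hypothetical sketch of the warm-up counter: one counter per green key,
# i.e. per application-level loop.  The real threshold is configurable.
HOT_THRESHOLD = 1000   # made-up value

class WarmEnterState(object):
    def __init__(self):
        self.counters = {}

    def maybe_compile_and_run(self, greens):
        key = tuple(greens)
        n = self.counters.get(key, 0) + 1
        self.counters[key] = n
        if n >= HOT_THRESHOLD:
            return 'enter tracing mode'    # loop is hot: trace and compile it
        return 'keep interpreting'

state = WarmEnterState()
```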
-
-*Tracing* is where the JIT interprets the bytecode, generated at
-translation time, of the interpreter interpreting the application-level
-code.  This allows it to see the exact operations that make up the
-application-level loop.  Tracing is performed by the MetaInterp and
-MIFrame classes in metainterp/pyjitpl.py.
-maybe_compile_and_run() creates a MetaInterp and calls its
-compile_and_run_once() method.  This initializes the MIFrame for the input
-arguments of the loop, the red and green variables passed from the
-jit_merge_point hint, and sets it to start interpreting the bytecode of the
-portal graph.
-
-Before starting the interpretation, each loop input argument is wrapped
-in a *box*.  Boxes (defined in metainterp/history.py) wrap the value
-and type of a value in the program the JIT is interpreting.  There are
-two main varieties of
-boxes: constant boxes and normal boxes.  Constant boxes are used for values
-assumed to be known during tracing.  These are not necessarily compile time
-constants.  All values which are "promoted", assumed to be constant by the JIT
-for optimization purposes, are also stored in constant boxes.  Normal boxes
-contain values that may change during the running of a loop.  There are three
-kinds of normal boxes: BoxInt, BoxPtr, and BoxFloat, and four kinds of constant
-boxes: ConstInt, ConstPtr, ConstFloat, and ConstAddr.  (ConstAddr is only used
-to get around a limitation in the translation toolchain.)
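A heavily condensed sketch of this hierarchy (the real classes in metainterp/history.py carry much more machinery, and the ``is_constant`` flag here is an invented simplification):

```python
class Box(object):
    """A normal box: wraps a run-time value that may change."""
    is_constant = False
    def __init__(self, value):
        self.value = value

class BoxInt(Box):
    pass

class ConstInt(Box):
    """A constant box: the value is assumed known during tracing,
    e.g. because it was promoted -- not necessarily a compile-time
    constant."""
    is_constant = True

i = BoxInt(41)    # may hold a different value on each loop iteration
c = ConstInt(1)   # fixed for this trace; a guard protects the assumption
```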
-
-The meta-interpreter starts interpreting the JIT bytecode.  Each operation is
-executed and then recorded in a list of operations, called the trace.
-Operations can have a list of boxes they operate on, called arguments.
-Some operations (like GETFIELD and GETARRAYITEM) also have special
-objects that describe how their arguments are laid out in memory.  All
-possible operations generated by tracing are listed in
-metainterp/resoperation.py.  When an (interpreter-level) call to a
-function the JIT has bytecode for occurs during tracing, another
-MIFrame is added to the stack and the tracing continues with the same history.
-This flattens the list of operations over calls.  Most importantly, it unrolls
-the opcode dispatch loop.  Interpretation continues until the can_enter_jit hint
-is seen.  At this point, a whole iteration of the application level loop has
-been seen and recorded.
-
-Because only one iteration has been recorded, the JIT only knows about
-one codepath in the loop.  For example, if there's an ``if`` statement
-like this::
-
-   if x:
-       do_something_exciting()
-   else:
-       do_something_else()
-
-and ``x`` is true when the JIT does tracing, only the codepath
-``do_something_exciting`` will be added to the trace.  In future runs, to ensure
-that this path is still valid, a special operation called a *guard operation* is
-added to the trace.  A guard is a small test that checks if assumptions the JIT
-makes during tracing are still true.  In the example above, a GUARD_TRUE guard
-will be generated for ``x`` before running ``do_something_exciting``.
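Conceptually, the recorded trace for a run where ``x`` was true has the shape below.  This is an illustration only: real traces are lists of resoperation objects with boxed arguments, not tuples of strings:

```python
# Illustrative shape of a trace: one straight-line code path, with a
# guard protecting each assumption that was made while tracing.
trace = [
    ('guard_true', 'x'),                # fails -> fall back to the interpreter
    ('call', 'do_something_exciting'),  # the path that was actually taken
    ('jump',),                          # back to the top of the loop
]
```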
-
-Once the meta-interpreter has verified that it has traced a loop, it decides how
-to compile what it has.  There is an optional optimization phase between these
-actions, which is covered further down this page.  The backend converts the trace
-operations into assembly for the particular machine.  It then hands the compiled
-loop back to the frontend.  The next time the loop is seen in application code,
-the optimized assembly can be run instead of the normal interpreter.
-
-
-Optimizations
--------------
-
-The JIT employs several techniques, old and new, to make machine code run
-faster.
-
-Virtuals and Virtualizables
-***************************
-
-A *virtual* value is an array, struct, or RPython-level instance that is
-created during the loop and does not escape from it, either via calls or
-by outliving the loop.  Since it is only used by the JIT, it can be
-"optimized out"; the value doesn't have to be allocated at all and its
-fields can be stored as first-class values instead of being dereferenced
-in memory.  Virtuals allow temporary objects in the interpreter to be
-unwrapped.  For example, a W_IntObject in PyPy can be unwrapped to just
-its integer value as long as the object is known not to escape the
-machine code.
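The effect can be illustrated in plain Python.  W_IntObject here is a simplified stand-in for PyPy's real class, and the two functions contrast what the traced interpreter-level code does with what the loop effectively computes after the optimization:

```python
class W_IntObject(object):        # simplified stand-in for PyPy's boxed int
    def __init__(self, intval):
        self.intval = intval

# Before: the interpreter-level code allocates a wrapper per operation.
def add_boxed(w_a, w_b):
    return W_IntObject(w_a.intval + w_b.intval)

# After: once the result is known not to escape the machine code, the
# W_IntObject is "virtual" and its intval field lives as a plain value.
def add_unboxed(a, b):
    return a + b                  # no allocation at all
```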
-
-A *virtualizable* is similar to a virtual in that its structure is optimized out
-in the machine code.  Virtualizables, however, can escape from JIT controlled
-code.
-
-Other optimizations
-*******************
-
-Most of the JIT's optimizer is contained in the subdirectory
-``metainterp/optimizeopt/``.  Refer to it for more details.
-
-
-More resources
-==============
-
-More documentation about the current JIT is available in our first
-published article:
-
-* `Tracing the Meta-Level: PyPy's Tracing JIT Compiler`__
-
-.. __: http://codespeak.net/svn/pypy/extradoc/talk/icooolps2009/bolz-tracing-jit-final.pdf
-
-as well as the `blog posts with the JIT tag.`__
-
-.. __: http://morepypy.blogspot.com/search/label/jit

File rpython/doc/jit/index.rst

+========================================================================
+                          JIT documentation
+========================================================================
+
+:abstract:
+
+    When PyPy is translated into an executable such as ``pypy-c``, the
+    executable contains a full virtual machine that can optionally
+    include a Just-In-Time compiler.  This JIT compiler is **generated
+    automatically from the interpreter** that we wrote in RPython.
+
+    This JIT Compiler Generator can be applied to interpreters for any
+    language, as long as the interpreter itself is written in RPython
+    and contains a few hints to guide the JIT Compiler Generator.
+
+
+Content
+------------------------------------------------------------
+
+- Overview_: motivating our approach
+
+- Notes_ about the current work in PyPy
+
+- Hooks_: debugging facilities available to a Python programmer
+
+
+.. _Overview: overview.html
+.. _Notes: pyjitpl5.html
+.. _Hooks: ../jit-hooks.html

File rpython/doc/jit/overview.rst

+------------------------------------------------------------------------
+                   Motivating JIT Compiler Generation
+------------------------------------------------------------------------
+
+.. contents::
+
+This is a non-technical introduction and motivation for PyPy's approach
+to Just-In-Time compiler generation.
+
+
+Motivation
+========================================================================
+
+Overview
+--------
+
+Writing an interpreter for a complex dynamic language like Python is not
+a small task, especially if, for performance goals, we want to write a
+Just-in-Time (JIT) compiler too.
+
+The good news is that it's not what we did.  We indeed wrote an
+interpreter for Python, but we never wrote any JIT compiler for Python
+in PyPy.  Instead, we use the fact that our interpreter for Python is
+written in RPython, which is a nice, high-level language -- and we turn
+it *automatically* into a JIT compiler for Python.
+
+This transformation is of course completely transparent to the user,
+i.e. the programmer writing Python programs.  The goal (which we
+achieved) is to support *all* Python features -- including, for example,
+random frame access and debuggers.  But it is also mostly transparent to
+the language implementor, i.e. to the source code of the Python
+interpreter.  It only needs a bit of guidance: we had to put a small
+number of hints in the source code of our interpreter.  Based on these
+hints, the *JIT compiler generator* produces a JIT compiler which has
+the same language semantics as the original interpreter by construction.
+This JIT compiler itself generates machine code at runtime, aggressively
+optimizing the user's program and leading to a big performance boost,
+while keeping the semantics unmodified.  Of course, the interesting bit
+is that our Python language interpreter can evolve over time without
+getting out of sync with the JIT compiler.
+
+
+The path we followed
+--------------------
+
+Our previous incarnations of PyPy's JIT generator were based on partial
+evaluation. This is a well-known and much-researched topic, considered
+to be very promising. There have been many attempts to use it to
+automatically transform an interpreter into a compiler. However, none of
+them have led to substantial speedups for real-world languages. We
+believe that the missing key insight is to use partial evaluation to
+produce just-in-time compilers, rather than classical ahead-of-time
+compilers.  If this turns out to be correct, the practical speed of
+dynamic languages could be vastly improved.
+
+All these previous JIT compiler generators were producing JIT compilers
+similar to the hand-written Psyco.  But since 2009, our prototype no
+longer uses partial evaluation -- at least not in a way that would
+convince paper reviewers.  It is instead based on the notion of a
+*tracing JIT*, recently studied for Java and JavaScript.  Compared to
+existing tracing JITs, however, partial evaluation gives us some extra
+techniques that we already had in our previous JIT generators, notably
+how to optimize structures by removing allocations.
+
+The closest comparison to our current JIT is Mozilla's TraceMonkey.
+However, this JIT compiler is written manually, which is quite some
+effort.  In PyPy, we write a JIT generator at the level of RPython,
+which means that our final JIT does not have to -- indeed, cannot -- be
+written to encode all the details of the full Python language.  These
+details are automatically supplied by the fact that we have an
+interpreter for full Python.
+
+
+Practical results
+-----------------
+
+The JIT compilers that we generate use some techniques that are not in
+widespread use so far, but they are not exactly new either.  The point
+we want to make here is not that we are pushing the theoretical limits
+of how fast a given dynamic language can be run.  Our point is: we are
+making it **practical** to have reasonably good Just-In-Time compilers
+for all dynamic languages, no matter how complicated or non-widespread
+(e.g. Open Source dynamic languages without large industry or academic
+support, or internal domain-specific languages).  By practical we mean
+that this should be:
+
+* Easy: requires little more effort than writing the interpreter in the
+  first place.
+
+* Maintainable: our generated JIT compilers are not separate projects
+  (we do not generate separate source code, but only throw-away C code
+  that is compiled into the generated VM).  In other words, the whole
+  JIT compiler is regenerated anew every time the high-level interpreter
+  is modified, so that it cannot get out of sync no matter how fast
+  the language evolves.
+
+* Fast enough: we can get some rather good performance out of the
+  generated JIT compilers.  That's the whole point, of course.
+
+
+Alternative approaches to improve speed
+========================================================================
+
++----------------------------------------------------------------------+
+| :NOTE:                                                               |
+|                                                                      |
+|   Please take the following section as just a statement of opinion.  |
+|   In order to be debated over, the summaries should first be         |
+|   expanded into full arguments.  We include them here as links;      |
+|   we are aware of them, even if sometimes pessimistic about them     |
+|   ``:-)``                                                            |
++----------------------------------------------------------------------+
+
+There are a large number of approaches to improving the execution speed of
+dynamic programming languages, most of which produce only small
+improvements; none offers the flexibility and customisability provided
+by our approach.
+Over the last 6 years of tweaking, the speed of CPython has only improved by a
+factor of 1.3 or 1.4 (depending on benchmarks).  Many tweaks are applicable to
+PyPy as well. Indeed, some of the CPython tweaks originated as tweaks for PyPy.
+
+IronPython initially achieved a speed of about 1.8 times that of CPython by
+leaving out some details of the language and by leveraging the large investment
+that Microsoft has put into making the .NET platform fast; the current, more
+complete implementation has roughly the same speed as CPython.  In general, the
+existing approaches have reached the end of the road, speed-wise.  Microsoft's
+Dynamic Language Runtime (DLR), often cited in this context, is essentially
+only an API to make the techniques pioneered in IronPython official.  At best,
+it will give another small improvement.
+
+Another technique regularly mentioned is adding types to the language in order
+to speed it up: either explicit optional typing or soft typing (i.e., inferred
+"likely" types).  For Python, all projects in this area have started with a
+simplified subset of the language; no project has scaled up to anything close
+to the complete language.  This would be a major effort and be platform- and
+language-specific.  Moreover, maintenance would be a headache: we
+believe that many changes that are trivial to implement in CPython are
+likely to invalidate
+previous carefully-tuned optimizations.
+
+For major improvements in speed, JIT techniques are necessary.  For Python,
+Psyco gives typical speedups of 2 to 4 times - up to 100 times in algorithmic
+examples.  It has come to a dead end because of the difficulty and huge costs
+associated with developing and maintaining it.  It has a relatively poor
+encoding of language semantics - knowledge about Python behavior needs to be
+encoded by hand and kept up-to-date.  At least, Psyco works correctly even when
+encountering one of the numerous Python constructs it does not support, by
+falling back to CPython.  The PyPy JIT started out as a metaprogrammatic,
+non-language-specific equivalent of Psyco.
+
+A different kind of prior art is self-hosting JIT compilers such as Jikes.
+Jikes is a JIT compiler for Java written in Java. It has a poor encoding of
+language semantics; it would take an enormous amount of work to encode all the
+details of a Python-like language directly into a JIT compiler.  It also has
+limited portability, which is an issue for Python; it is likely that large
+parts of the JIT compiler would need retargetting in order to run in a
+different environment than the intended low-level one.
+
+Simply reusing an existing well-tuned JIT like that of the JVM does not
+really work, because of concept mismatches between the implementor's
+language and the host VM language: the former needs to be compiled to
+the target environment in such a way that the JIT is able to speed it up
+significantly - an approach which essentially has failed in Python so
+far: even though CPython is a simple interpreter, its Java and .NET
+re-implementations are not significantly faster.
+
+More recently, several larger projects have started in the JIT area.  For
+instance, Sun Microsystems is investing in JRuby, which aims to use the Java
+Hotspot JIT to improve the performance of Ruby. However, this requires a lot of
+hand crafting and will only provide speedups for one language on one platform.
+Some issues are delicate, e.g., how to remove the overhead of constantly boxing
+and unboxing, typical in dynamic languages.  An advantage compared to PyPy is
+that there are some hand optimizations that can be performed, that do not fit
+in the metaprogramming approach.  But metaprogramming makes the PyPy JIT
+reusable for many different languages on many different execution platforms.
+It is also possible to combine the approaches - we can get substantial speedups
+using our JIT and then feed the result to Java's Hotspot JIT for further
+improvement.  One of us is even a member of the `JSR 292`_ Expert Group
+to define additions to the JVM to better support dynamic languages, and
+is contributing insights from our JIT research, in ways that will also
+benefit PyPy.
+
+Finally, tracing JITs are now emerging for dynamic languages like
+JavaScript with TraceMonkey.  The JIT that PyPy generates is based on the same
+concepts as these tracing JITs, but unlike them it is not hand-written.
+
+
+Further reading
+========================================================================
+
+The description of the current PyPy JIT generator is given in PyJitPl5_
+(draft).
+
+.. _`JSR 292`: http://jcp.org/en/jsr/detail?id=292
+.. _PyJitPl5: pyjitpl5.html

File rpython/doc/jit/pyjitpl5.rst

+==========
+ PyJitPl5
+==========
+
+This document describes the fifth generation of PyPy's JIT.
+
+
+Implementation of the JIT
+=========================
+
+The JIT's `theory`_ is great in principle, but the actual code is a different
+story. This section tries to give a high level overview of how PyPy's JIT is
+implemented.  It's helpful to have an understanding of how the `RPython translation
+toolchain`_ works before digging into the sources.
+
+Almost all JIT-specific code is found in the pypy/jit subdirectories.
+Translation-time code is in the codewriter directory.  The metainterp
+directory holds platform-independent code, including the tracer and the
+optimizer.  Code in the backend directory is responsible for generating
+machine code.
+
+.. _`theory`: overview.html
+.. _`RPython translation toolchain`: ../translation.html
+
+
+JIT hints
+---------
+
+To add a JIT to an interpreter, PyPy only requires that two hints be added to
+the target interpreter.  These are jit_merge_point and can_enter_jit.
+jit_merge_point is supposed to go at the start of opcode dispatch.  It allows
+the JIT to bail back to the interpreter in case running machine code is no
+longer suitable.  can_enter_jit goes at the end of an application-level loop.  In
+the Python interpreter, this is the JUMP_ABSOLUTE bytecode.  The Python
+interpreter defines its hints in pypy/module/pypyjit/interp_jit.py in a few
+overridden methods of the default interpreter loop.
+
+An interpreter wishing to use PyPy's JIT must define a list of *green*
+variables and a list of *red* variables.  The *green* variables are loop
+constants.  They are used to identify the current loop.  Red variables are for
+everything else used in the execution loop.  For example, the Python interpreter
+passes the code object and the instruction pointer as greens and the frame
+object and execution context as reds.  These objects are passed to the JIT at
+the location of the JIT hints.
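
To make the shape of this concrete, here is a hypothetical toy sketch: a tiny
stack-machine interpreter with a stand-in ``JitDriverStub`` that only mimics
where RPython's real JitDriver hints would be placed.  All names and the
bytecode format below are invented for illustration.

```python
# Toy interpreter showing where the two JIT hints go.  The stub only
# counts hint calls; RPython's real JitDriver does much more.

class JitDriverStub:
    """Stand-in for RPython's JitDriver: greens identify the loop,
    reds are everything else that varies while the loop runs."""
    def __init__(self, greens, reds):
        self.greens, self.reds = greens, reds
        self.merge_points = 0      # how often dispatch was entered
        self.loop_closings = 0     # how often an app-level loop closed

    def jit_merge_point(self, **live_vars):
        self.merge_points += 1     # start of opcode dispatch

    def can_enter_jit(self, **live_vars):
        self.loop_closings += 1    # a backward jump closed a loop

driver = JitDriverStub(greens=['pc', 'code'], reds=['stack'])

def interpret(code):
    pc, stack = 0, []
    while pc < len(code):
        # Hint 1: every pass through opcode dispatch.
        driver.jit_merge_point(pc=pc, code=code, stack=stack)
        op = code[pc]
        if op[0] == 'PUSH':
            stack.append(op[1]); pc += 1
        elif op[0] == 'ADD':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b); pc += 1
        elif op[0] == 'JUMP_IF_POS':
            if stack[-1] > 0:
                pc = op[1]
                # Hint 2: a backward jump ends an app-level loop.
                driver.can_enter_jit(pc=pc, code=code, stack=stack)
            else:
                pc += 1
    return stack.pop()

# A countdown loop: 3 -> 2 -> 1 -> 0, then fall through.
code = [('PUSH', 3), ('PUSH', -1), ('ADD',), ('JUMP_IF_POS', 1)]
result = interpret(code)
```

Note how the greens (``pc``, ``code``) are exactly the values that identify
"which loop is this", while the red ``stack`` merely varies as the loop runs.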
+
+
+JIT generation
+--------------
+
+After the RTyping phase of translation, where high level Python operations are
+turned into low-level ones for the backend, the translation driver calls
+apply_jit() in metainterp/warmspot.py to add a JIT compiler to the interpreter
+being translated.  apply_jit() decides which assembler backend to use, then
+delegates the rest of the work to the WarmRunnerDesc class.  WarmRunnerDesc
+finds the two JIT hints in the function graphs.  It rewrites the graph
+containing the jit_merge_point hint, called the portal graph, to be able to
+handle special JIT exceptions, which indicate special conditions to the
+interpreter upon exiting from the JIT.  The location of the can_enter_jit hint
+is replaced with a call to a function, maybe_compile_and_run in warmstate.py,
+that checks whether the current loop is "hot" and should be compiled.
+
+Next, starting with the portal graph, codewriter/\*.py converts the graphs of
+the interpreter into JIT bytecode.  Since this bytecode is stored in the final
+binary, it's designed to be concise rather than fast.  The codewriter doesn't
+"see" every part of the interpreter (what it sees is defined by the JIT's
+policy); in those cases, it simply inserts an opaque call.
+
+Finally, translation finishes, including the bytecode of the interpreter in the
+final binary, and the interpreter is ready to use the runtime component of the
+JIT.
+
+
+Tracing
+-------
+
+Application code running on the JIT-enabled interpreter starts normally; it is
+interpreted on top of the usual evaluation loop.  When an application loop is
+closed (where the can_enter_jit hint was), the interpreter calls the
+maybe_compile_and_run() method of WarmEnterState.  This method increments a
+counter associated with the current green variables.  When this counter reaches
+a certain level, usually indicating the application loop has been run many
+times, the JIT enters tracing mode.
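
The counter logic can be sketched with a hypothetical stub; the class name,
return values, and threshold below are made up for illustration and are not
taken from warmstate.py.

```python
# Toy version of the hot-loop bookkeeping: one counter per green key;
# once it crosses a threshold, tracing starts, and afterwards the
# compiled machine code is used directly.

class WarmStateStub:
    THRESHOLD = 3                  # arbitrary; the real value is tunable

    def __init__(self):
        self.counters = {}         # greens -> times the loop was seen
        self.compiled = set()      # greens already traced and compiled

    def maybe_compile_and_run(self, greens):
        if greens in self.compiled:
            return 'run_machine_code'
        n = self.counters.get(greens, 0) + 1
        self.counters[greens] = n
        if n >= self.THRESHOLD:
            self.compiled.add(greens)   # enter tracing mode
            return 'trace'
        return 'interpret'

state = WarmStateStub()
greens = ('code_object', 42)       # e.g. (code, instruction pointer)
outcomes = [state.maybe_compile_and_run(greens) for _ in range(5)]
```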
+
+*Tracing* is where the JIT interprets the bytecode, generated at translation
+time, of the interpreter interpreting the application-level code.  This allows
+it to see the exact operations that make up the application-level loop.  Tracing
+is performed by the MetaInterp and MIFrame classes in metainterp/pyjitpl.py.
+maybe_compile_and_run() creates a MetaInterp and calls its
+compile_and_run_once() method.  This initializes the MIFrame for the input
+arguments of the loop, the red and green variables passed from the
+jit_merge_point hint, and sets it to start interpreting the bytecode of the
+portal graph.
+
+Before starting the interpretation, the loop input arguments are wrapped in a
+*box*.  Boxes (defined in metainterp/history.py) wrap the value and type of a
+value in the program the JIT is interpreting.  There are two main varieties of
+boxes: constant boxes and normal boxes.  Constant boxes are used for values
+assumed to be known during tracing.  These are not necessarily compile time
+constants.  All values which are "promoted", assumed to be constant by the JIT
+for optimization purposes, are also stored in constant boxes.  Normal boxes
+contain values that may change during the running of a loop.  There are three
+kinds of normal boxes: BoxInt, BoxPtr, and BoxFloat, and four kinds of constant
+boxes: ConstInt, ConstPtr, ConstFloat, and ConstAddr.  (ConstAddr is only used
+to get around a limitation in the translation toolchain.)
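
The distinction can be sketched with two minimal classes; the names echo those
in metainterp/history.py, but the implementations below are simplified
stand-ins.

```python
# Constant boxes hold values assumed fixed during tracing; normal
# boxes hold values that may change from one loop iteration to the next.

class ConstInt:
    def __init__(self, value):
        self.value = value

    def is_constant(self):
        return True

class BoxInt:
    def __init__(self, value):
        self.value = value         # snapshot of the value while tracing

    def is_constant(self):
        return False

loop_limit = ConstInt(100)         # e.g. a promoted value
counter = BoxInt(0)                # changes on every iteration
```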
+
+The meta-interpreter starts interpreting the JIT bytecode.  Each operation is
+executed and then recorded in a list of operations, called the trace.
+Operations can carry a list of boxes they operate on, their arguments.  Some operations
+(like GETFIELD and GETARRAYITEM) also have special objects that describe how
+their arguments are laid out in memory.  All possible operations generated by
+tracing are listed in metainterp/resoperation.py.  When an (interpreter-level)
+call to a function the JIT has bytecode for occurs during tracing, another
+MIFrame is added to the stack and the tracing continues with the same history.
+This flattens the list of operations over calls.  Most importantly, it unrolls
+the opcode dispatch loop.  Interpretation continues until the can_enter_jit hint
+is seen.  At this point, a whole iteration of the application level loop has
+been seen and recorded.
+
+Because only one iteration has been recorded, the JIT only knows about one
+code path in the loop.  For example, if there's an if statement like
+this::
+
+   if x:
+       do_something_exciting()
+   else:
+       do_something_else()
+
+and ``x`` is true when the JIT does the tracing, only the code path through
+``do_something_exciting`` will be added to the trace.  To ensure that this path
+is still valid in future runs, a special operation called a *guard operation*
+is added to the trace.  A guard is a small test that checks whether the
+assumptions the JIT makes during tracing still hold.  In the example above, a
+GUARD_TRUE guard will be generated for ``x`` before running
+``do_something_exciting``.
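
The recording of a guard can be sketched as follows; the operation names echo
resoperation.py, but the recorder itself is a made-up toy.

```python
# Toy trace recorder: during tracing, a branch does not become a jump
# in the trace; instead the condition's observed outcome is recorded
# as a guard, followed only by the path actually taken.

trace = []

def record(op, *args):
    trace.append((op,) + args)

def traced_branch(x):
    if x:
        record('guard_true', 'x')  # assume x stays true on later runs
        record('call', 'do_something_exciting')
    else:
        record('guard_false', 'x')
        record('call', 'do_something_else')

traced_branch(True)                # x happened to be true while tracing
```

If the guard later fails at run time, execution bails back from the machine
code to the regular interpreter.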
+
+Once the meta-interpreter has verified that it has traced a loop, it decides how
+to compile what it has.  There is an optional optimization phase between these
+actions, which is covered further down this page.  The backend converts the trace
+operations into assembly for the particular machine.  It then hands the compiled
+loop back to the frontend.  The next time the loop is seen in application code,
+the optimized assembly can be run instead of the normal interpreter.
+
+
+Optimizations
+-------------
+
+The JIT employs several techniques, old and new, to make machine code run
+faster.
+
+Virtuals and Virtualizables
+***************************
+
+A *virtual* value is an array, struct, or RPython-level instance that is
+created during the loop and does not escape from it, either through calls or by
+outliving the loop.  Since it is only used by the JIT, it can be "optimized
+out": the value doesn't have to be allocated at all, and its fields can be
+stored as first-class values instead of being dereferenced in memory.  Virtuals
+allow temporary objects in the interpreter to be unwrapped.  For example, a
+W_IntObject in PyPy can be unwrapped to just its integer value, as long as the
+object is known not to escape the machine code.
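
Allocation removal for virtuals can be sketched as a tiny pass over a toy
trace; the operation format and the pass below are illustrative only, and much
simpler than the real optimizer.

```python
# Toy allocation-removal pass: a 'new' whose result never escapes the
# trace (via 'call' or 'return') is a virtual; its allocation and field
# operations can be folded into direct uses of the stored values.

def remove_virtuals(trace):
    escaped = set()
    for op, args in trace:
        if op in ('call', 'return'):       # these let values escape
            escaped.update(args)
    fields = {}                            # virtual -> {field: value}
    out = []
    for op, args in trace:
        if op == 'new' and args[0] not in escaped:
            fields[args[0]] = {}           # allocation optimized away
        elif op == 'setfield' and args[0] in fields:
            fields[args[0]][args[1]] = args[2]
        elif op == 'getfield' and args[0] in fields:
            # the read becomes a direct use of the stored value
            out.append(('assign', (args[2], fields[args[0]][args[1]])))
        else:
            out.append((op, args))
    return out

# A W_IntObject-like wrapper created and consumed inside the loop:
trace = [('new', ('w_int',)),
         ('setfield', ('w_int', 'value', 'i0')),
         ('getfield', ('w_int', 'value', 'i1')),
         ('return', ('i1',))]
optimized = remove_virtuals(trace)
```

The wrapper object disappears entirely; only its integer field survives.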
+
+A *virtualizable* is similar to a virtual in that its structure is optimized out
+in the machine code.  Virtualizables, however, can escape from JIT controlled
+code.
+
+Other optimizations
+*******************
+
+Most of the JIT's optimizer is contained in the subdirectory
+``metainterp/optimizeopt/``.  Refer to it for more details.
+
+
+More resources
+==============
+
+More documentation about the current JIT is available in the first published
+article:
+
+* `Tracing the Meta-Level: PyPy's Tracing JIT Compiler`__
+
+.. __: http://codespeak.net/svn/pypy/extradoc/talk/icooolps2009/bolz-tracing-jit-final.pdf
+
+as well as the `blog posts with the JIT tag.`__
+
+.. __: http://morepypy.blogspot.com/search/label/jit