Commits

holger krekel  committed 7926d02

moved rest of doc to "extradoc" (i think it is better
to mark this doc-directory as being extra, otherwise
people might look here for dev-documentation when
just browsing the repository)


Files changed (18)

File Diagrams.sxi

Binary file added.

File amsterdam.sxi

Binary file added.

File api_html.tar.gz

Binary file added.
+Pypy Documentation 
+==================
+
+We have a fair amount of documentation for the Pypy project. The files
+are available from the website as html (view them along the left side of
+the pypy-doc webpage). They are also available from the repository,
+under the *doc/* directory or under the *doc/devel* sub-directory. Or,
+to catch up on what we've been up to lately, just peek at the
+recently-modified_ documents page.
+
+Overview
+--------
+
+If you just want an overview of the project, take a look at these items in *doc/*.
+
+ * architecture_:
+	a more technical overview of the current architecture 
+
+ * oscon2003-paper_:
+	presentation to OSCON on what pypy is about and why you should care
+
+
+Getting Started
+---------------
+
+If you want to get involved, take a look at the following documentation to get a better taste:
+
+These files are in the *doc/* directory:
+
+ * howtopypy_:
+	provides some hands-on instructions for getting started
+	
+
+ * readme_:
+	this file is on using ReST for pypy documentation
+
+
+ * wrapping_:
+	a description of application-level and interpreter-level wrapped objects
+
+This file is in the *doc/devel/* sub-directory:
+
+ * howtosvn_:
+	for new users of subversion
+
+Before you code
+---------------
+
+Before doing pypy work, you should also take a look at these developer-specific instructions, found in the *doc/devel/* sub-directory of the repository:
+
+ * coding-style_:
+	covers pypy coding conventions
+
+	
+
+ * optionaltool_:
+	there are some optional tools we use for pypy. 
+
+ * testdesign_:
+	pypy is a test-driven development project. Read here to find out more about how we're doing testing.
+
+Further reading
+---------------
+
+* An interesting thread on an HP tech report that may be proof that pypy is feasible_ . (We already knew that...)
+
+* An interesting thread on why VHLL rock_ . (We already knew that too.)
+	
+* A thread on Python in Scheme_ .
+
+* An intriguing project, FlashMob_ - creating an ad hoc supercomputer.
+
+* A discussion on Python and lisp_ support
+
+* An interesting repository_ of papers by Xerox Parc members, with quite a few issues more or less relevant to PyPy.
+
+* A thread on the gnu lightning_ project. "GNU lightning is a library that generates assembly language code at run-time; it is very fast, making it ideal for Just-In-Time compilers, and it abstracts over the target CPU, as it exposes to the clients a standardized RISC instruction set inspired by the MIPS and SPARC chips."
+
+* A project to create a Low Level Virtual Machine (LLVM_) and a PyPy-LLVM_ discussion, and conversation_ between PyPy and LLVM.
+
+* A thread discussing the xhelix_ python C extension implementing Helix encryption and authentication, which may be interesting to use as a pypy performance test at some point.
+
+* A paper for PyCon 2004: "IronPython_ is a new implementation of the Python language targeting the Common Language Runtime (CLR). It compiles python programs into bytecode (IL) that will run on either Microsoft's .NET or the Open Source Mono platform. IronPython includes an interactive interpreter and transparent on-the-fly compilation of source files just like standard Python. In addition, IronPython supports static compilation of Python code to produce static executables (.exe's) that can be run directly or static libraries (.dll's) that can be called from other CLR languages."
+
+* A comparison of Python and Pliant_ , an OS written in a python-like language. 
+
+
+.. _architecture: http://codespeak.net/pypy/index.cgi?doc/architecture.html
+.. _oscon2003-paper: http://codespeak.net/pypy/index.cgi?doc/oscon2003-paper.html
+.. _howtopypy: http://codespeak.net/pypy/index.cgi?doc/howtopypy.html
+.. _readme: http://codespeak.net/pypy/index.cgi?doc/readme.html
+.. _wrapping: http://codespeak.net/pypy/index.cgi?doc/wrapping.html
+.. _coding-style: http://codespeak.net/pypy/index.cgi?doc/devel/coding-style.html
+.. _howtosvn: http://codespeak.net/pypy/index.cgi?doc/devel/howtosvn.html
+.. _optionaltool: http://codespeak.net/pypy/index.cgi?doc/devel/optionaltool.html
+.. _testdesign: http://codespeak.net/pypy/index.cgi?doc/devel/testdesign.html
+.. _feasible: http://codespeak.net/pipermail/pypy-dev/2004q2/001289.html
+.. _rock: http://codespeak.net/pipermail/pypy-dev/2004q1/001255.html
+.. _Scheme: http://codespeak.net/pipermail/pypy-dev/2004q1/001256.html
+.. _FlashMob: http://www.flashmobcomputing.org/
+.. _lisp: http://codespeak.net/pipermail/pypy-dev/2003q4/001048.html
+.. _repository: http://www2.parc.com/csl/groups/sda/publications.shtml
+.. _lightning: http://codespeak.net/pipermail/pypy-dev/2003q4/001051.html
+.. _LLVM: http://llvm.cs.uiuc.edu/
+.. _PyPy-LLVM: http://codespeak.net/pipermail/pypy-dev/2003q4/001115.html
+.. _conversation: http://codespeak.net/pipermail/pypy-dev/2003q4/001119.html
+.. _xhelix: http://codespeak.net/pipermail/pypy-dev/2003q4/001129.html
+.. _IronPython: http://www.python.org/pycon/dc2004/papers/9/
+.. _pliant: http://pliant.cx 
+.. _recently-modified: http://codespeak.net/pypy/index.cgi?doc/recent

File irclog/annotations.txt

+About annotations
+=================
+
+We are running into limitations of the annotation system used for type inference.
+This document describes these limitations and how to shift the concepts around
+slightly to fix them, and probably also how the whole issue arose from having
+mixed concepts in the wrong ways in the first place.
+
+IRC log from October 28th::
+
+  <arigo> sanxiyn: ok for a few words about annotations?
+  <sanxiyn> yep!
+  <sanxiyn> (sorry for being out; I forgot it...)
+  <arigo> np
+  <arigo> mutable structures pose some problems
+  <sanxiyn> e.g.
+  <arigo> because you cannot say "len(x) = 5" if 'x' is a list, of course
+  <arigo> because the length of x could change
+  <arigo> so just propagating the annotation is wrong
+  <sanxiyn> ah.
+  <arigo> it's more annoying to say e.g. that x is a list of integers
+  <sanxiyn> Is it annoying?
+  <arigo> getitem(x, anything) = y & type(y) = int
+  <sanxiyn> yep.
+  <arigo> but what if you call f(x)
+  <arigo> and f adds strings to the list x ?
+  <sanxiyn> I think RPython list shall be homogenous.
+  <arigo> yes, but:
+  <arigo> x = []
+  <arigo> f(x)
+  <arigo> then f is allowed to put strings in x
+  <sanxiyn> ah, empty list thing...
+  <arigo> yes but also:
+  <arigo> x = ['hello']
+  <arigo> f(x)
+  <sanxiyn> ML languages have precisely the same problem, aren't they?
+  <arigo> yes but i think we can solve it here
+  <arigo> but we need to be careful
+  <sanxiyn> special casing empty list should work. (IIRC that's how it's done in ML, basically)
+  <arigo> yes but i think we can solve it here (didn't i say that already :-)
+  <sanxiyn> agreed. so let's solve it;
+  <sanxiyn> :)
+  <arigo> won't help verbosity, but let's think bout that later.
+  <sanxiyn> List length seems to be impossible to guarantee.
+  
+  <arigo> we can say:
+  <arigo> deref(x) = z ; getitem(z, anything) = y ; type(y) = int
+  <arigo> here x is our variable, but z is a Cell()
+  <arigo> so the list has a life of its own, independently from the variable it is in
+  * sanxiyn reads it carefully.
+  <arigo> what i'm thinking about is this:
+  <arigo> we would have (conceptually) a single big pool of annotation
+  <arigo> not one AnnotationSet per basic block
+  <arigo> only one, for the whole program
+  <sanxiyn> Yes. I found annset per block annoying, and felt that it's that way for no real reason.
+  <arigo> we would map variables to this big annotation set
+  <arigo> this must probably still be done for each block independently
+  <arigo> each block would have a map {variable: cell-in-the-big-annset}
+  <arigo> or maybe not
+  <sanxiyn> hm
+  <arigo> because variables are supposed to be unique anyway
+  <arigo> still, i think the big annset should not use variables at all, just cells and constants.
+  <sanxiyn> comments in get_variables_ann say otherwise, but I suspect it's outdated.
+  <arigo> "supposed" to be unique... no, they still aren't really
+  <sanxiyn> eh, confused.
+  <arigo> the comment is not outdated
+  <sanxiyn> what does it mean, then?
+  <arigo> the same Variable() is still used in several blocks
+  <arigo> that should be fixed
+  <sanxiyn> indeed.
+  <sanxiyn> I commented out XXX: variables must not be shared, and ran test_pyrextrans, and got 6 failures.
+  <arigo> yes
+  <arigo> all EggBlocks are wrong, currently
+  <sanxiyn> I don't know what Spam/Egg Blocks are.
+  <arigo> :-)
+  <sanxiyn> Don't know at all.
+  <arigo> it's funny names describing how the block was built
+  <arigo> they are all Blocks
+  <arigo> an EggBlock is used after a fork
+  <sanxiyn> fork?
+  <arigo> a split, after a block with two exits
+  <arigo> but that's not relevant to the other transformations
+  <arigo> which can simplify the graph after it is built
+  
+  <arigo> we could have a single big annset
+  <arigo> it represents "the heap" of an abstract CPython process
+  <sanxiyn> hm.
+  <arigo> i.e. objects in the heap
+  <arigo> like lists, integers, all of them
+  <arigo> using Cell() to represent abstract objects, and Constant() for concrete ones
+  <arigo> then a variable is only something which appears in the basic block's SpaceOperations
+  * arigo is confused
+  <sanxiyn> So Variable() points to Cell().
+  <arigo> yes...
+  <arigo> currently we cannot handle mutable lists because:
+  <arigo> getitem(x, *) = z
+  <arigo> is an annotation talking about the variable x
+  <arigo> so we cannot propagate the annotation forth and back to called sub-functions
+  <arigo> instead, getitem should talk about an object, not the variable that points to it
+  <sanxiyn> exactly!
+  <sanxiyn> That's Python-think. :)
+  <sanxiyn> http://starship.python.net/crew/mwh/hacks/objectthink.html
+  <sanxiyn> Is mwh's wonderful piece "How to think like a Pythonista" relevant here?
+  * arigo tries to do 4 things at the same times and fails to
+  <sanxiyn> So variables are names.
+  <sanxiyn> It binds.
+  <arigo> yes
+  <sanxiyn> mwh wrote: "I find the world variable to be particularly unhelpful in a Python context..."
+  <sanxiyn> with wonderful diagrams :)
+  <hpk> yah, introducing namespaces into abstract-interpretation world! :-)
+  <sanxiyn> namespace? eh, not exactly, I think...
+  <arigo> hpk: yes, each block is its own namespace here :-)
+  <arigo> and obviously we need "heap objects" that these names can refer to
+  <hpk> (namespaces in the meaning of "living" bindings between names and objects)
+  <sanxiyn> So "objects" are actually cells unless constant-propagated...
+  <arigo> yes...
+  <arigo> i think we could even go for a full-Prolog representation:
+  <arigo> the "big heap" contains cells and constants.  cells can become constants when we know more about them.
+  * sanxiyn should read Borges and Calvino as Martellibot suggested. :)
+  <arigo> seems cleaner than the current cell-variable-constant mix.
+  <arigo> in other words, a SpaceOperation uses variables only,
+  <arigo> and the variable can refer to a cell or a constant from the heap...
+  <arigo> the point is that the objects in the heap can be manipulated
+  <arigo> say a variable v1 points to a cell c
+  <arigo> with type(c) = list and len(c) = 3
+  <sanxiyn> v2 = v1 and v1 points to the same cell c.
+  <sanxiyn> you modify v2 and v1 is modified, too, etc.
+  <arigo> yes exactly
+  <arigo> if you append an item to the list then the annotation len(c) = 3 is deleted
+  <sanxiyn> Is "prolog" a pronoun for "non-determinism"?
+  <arigo> Logic Programming i think
+  
+  <sanxiyn> arigo: I think that solves "reflow".
+  <arigo> sanxiyn: yes, possibly
+  <arigo> you can add annotations freely, at least
+  <arigo> that's fine
+  <arigo> we'll just need a trick to delete ("retract") annotations
+  <arigo> because other annotations may depend on this one
+  <arigo> like type(c3)=int is only valid if type(c1)=int and type(c2)=int because we used an 'add' operation
+  <sanxiyn> Currently flowin does similar thing.
+  <sanxiyn> It recomputes all annotations if len(annset) is decreased.
+  <arigo> sanxiyn: yes, but it should work without the need to re-flowin
+  <sanxiyn> eh?
+  <sanxiyn> without re-flowin?
+  <arigo> if you delete an annotation, then you must recompute annotations recursively on the rest of the graph
+  <sanxiyn> yes, how to avoid that?
+  <arigo> we can record dependencies
+  <arigo> each annotation "knows" that it depends on some other ones
+  <hpk> question is if there are different ways of "depending" or just one way
+  <arigo> hpk: right
+  <hpk> in a way a space operation modifying the assertions denotes 'edges' in this dependency graph? 
+  <arigo> yes
+  <sanxiyn> I think annotation should know about *others* which depend on itself, not which itself depends on.
+  <arigo> yes
+  <arigo> when you kill an annotation, just follow the forward dependencies to kill the ones it depends on
+  <sanxiyn> So not dependency... reverse dependency? :)
+  <arigo> forward dependency... ?
+  <sanxiyn> Should be easy to add.
+  <hpk> "reasons"? 
+  <hpk> origin? 
+  <sanxiyn> hpk: no, consequences.
+  <sanxiyn> hpk: neither reason nor origin.
+  <arigo> "dependents" ?
+  <sanxiyn> As in SF novel "time patrol", if you change the past, the future is all changed.
+  <sanxiyn> how about consequences? I'm not good at naming...
+  <hpk> too long :-)
+  <sanxiyn> implication
+  <sanxiyn> too long ;
+  <arigo> consequences is fine if you don't have to type it too often :-)
+  <hpk> hmmm. 
+  <arigo> i guess we need an Annotation class whose constructor takes a list of dependencies, and records 'self' in these dependencies' "consequences" or whatever
+  <sanxiyn> I think only deletion routine need to refer it.
+  
+...cut. So if you have a good name for that attribute, speak up :-)
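
The idea reached at the end of the log (an annotation whose constructor takes its dependencies and records itself in their "consequences", so that killing one annotation also kills everything derived from it) might look roughly like this. It is only an illustrative sketch: the names ``Annotation``, ``consequences``, ``alive`` and ``retract`` are hypothetical and are not taken from the actual annotator code.

```python
class Annotation:
    """A fact about a heap cell, e.g. type(c1) = int (hypothetical sketch)."""

    def __init__(self, text, dependencies=()):
        self.text = text
        self.alive = True
        self.consequences = []              # annotations derived from this one
        for dep in dependencies:
            dep.consequences.append(self)   # record the forward dependency

    def retract(self):
        """Kill this annotation and, recursively, everything derived from it."""
        if not self.alive:
            return
        self.alive = False
        for consequence in self.consequences:
            consequence.retract()


# The example from the log: type(c3)=int is only valid because an 'add'
# was applied to two cells that are both known to be ints.
c1_is_int = Annotation("type(c1) = int")
c2_is_int = Annotation("type(c2) = int")
c3_is_int = Annotation("type(c3) = int", dependencies=[c1_is_int, c2_is_int])

c1_is_int.retract()       # c1 turns out not to be an int after all
print(c3_is_int.alive)    # the derived annotation has been retracted too
print(c2_is_int.alive)    # annotations that did not depend on c1 survive
```

Because each annotation only stores its forward links, retraction walks exactly the affected part of the dependency graph, which is the point made in the log about avoiding a full re-flow.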

File irclog/llvm.txt

+LLVM
+====
+
+First discussion about using LLVM as a target language.
+LLVM (Low Level Virtual Machine) is a Compiler Infrastructure;
+see http://llvm.cs.uiuc.edu/.
+
+IRC log from October 28th::
+
+  <stackless>	So I smell something growing here...
+  <sanxiyn>	arigo: thanks. I lost some of up-logs... so I asked.
+  <arigo>	stackless: nice
+  <stackless>	on that assembly target: How is their source code? Had no time to look. I hope
+  <stackless>	they don't use huge ugly other languages like ML?
+  <sanxiyn>	stackless: good for you! I thank Richard Emslie, I thank Richard Emslie (he repeats)
+  <hpk>	arigo: uh, bob ippolito just wrote that LLVM is all C++
+  <arigo>	stackless: i doubt it
+  <sanxiyn>	Yep. LLVM is in C++.
+  <sanxiyn>	arigo: so logging & summary is for you (evil grin)
+  <arigo>	sanxiyn: yes
+  <arigo>	stackless: i think they are using fast custom back-ends for runtime code generation
+  <stackless>	arigo: that sounds like what I like.
+  <arigo>	stackless: they also mentioned grabbing parts of GCC
+  <hpk>	who bothers - we have to have some binding with C++ then :-)
+  <arigo>	stackless: or ideas and AST structures at least
+  <hpk>	but they seemed to like to move away from it (because of licensing issues)
+  <stackless>	well, they might like PyPy and decide to become part of the project, supporting us.
+  *	sanxiyn baffles, "ML is neither huge nor ugly!"
+  <arigo>	stackless: yes !
+  <arigo>	in all cases i think that a genllvm.py should be easy to write
+  <hpk>	right
+  <arigo>	and if their compilers are good it could be faster than C
+  *	stackless apologises, didn't mean ML, probably. But last time he looked into C--, he was unhappy to pull so much tings in...
+  <arigo>	because it has a lot of meta-information
+  <arigo>	not only types, but single-step-assignment guarantees no aliasing, whereas GCC tries hard to find out what could alias what
+  <stackless>	single-step-assignment is one thing I remember from C--
+  <arigo>	yes
+  <arigo>	it's a good idea
+  <stackless>	really good. They never have expressions in function calls.
+  <arigo>	and it's natural for intermediate languages like our flow graphs
+  <stackless>	Instead, order of evaluation is crystal clear.
+  <sanxiyn>	I think FlowModel has that property too...
+  <arigo>	yes
+  <sanxiyn>	since it's derived from Python bytecode... etc.
+  <arigo>	interesting stuff from the e-mail at http://mail.cs.uiuc.edu/pipermail/llvmdev/2003-October/000501.html
+  <arigo>	"programs which have high-degree basic blocks"
+  <arigo>	high-degree mean (unless i'm mistaken) a lot of inputargs
+  <arigo>	we have a lot of them indeed
+  <hpk>	yes!
+  <hpk>	that's what'
+  <arigo>	that's a problem when using languages like ML as intermediate languages
+  <arigo>	you can write functions with 23 arguments
+  <arigo>	but the compiler isn't optimized for that
+  <arigo>	i tried, it produces bad code :-)
+  <sanxiyn>	Ah, I heard it from Lisp gotchas, i.e. it's easier to write slow code in Lisp.
+  <sanxiyn>	(it specifically mentioned problem with multiple value return optimization. sounds similar.)
+  <sanxiyn>	btw, what is SSA...
+  <arigo>	single-step assignment (never write to a variable more than once)
+  <sanxiyn>	I'm not sure how does it help, but I don't know much about this area.
+  <hpk>	i a m just skimming the source code
+  <hpk>	looks nice and readable
+  <hpk>	and good inline documentation it seems
+  <hpk>	it is c++ though :-)
+  <sanxiyn>	arigo: Was thinking more about SpaceOp/Annset. It's a constraint-based programming.
+  <arigo>	yes, constrain propagation...
+  <sanxiyn>	That's what Screamer (sorry, don't know about others. this one is Lisp) do very well...
+  <sanxiyn>	Integer range analysis and all goodies.
+  *	hpk has not often seen such nice c++ code ...
+  <sanxiyn>	So it's not really a new idea. But that means we have lots of expereince to learn from.
+  <arigo>	sanxiyn: yes
+  *	sanxiyn loads screamer intro he downloaded but have never read.
+  <hpk>	hmmm, it's really a high level c++ code, probably pretty easy to convert to python (the parts i have seen)
+  <sanxiyn>	How much code is LLVM?
+  <hpk>	i have no idea
+  <hpk>	i just read the commit mails
+  <sanxiyn>	ls
+  <sanxiyn>	PyPy is currently 39844 lines of code.
+  <sanxiyn>	(22000 of them is PyPy, 16000 Pyrex.)
+  <hpk>	what? 
+  <hpk>	16000 pyrex? what do you mean? 
+  <sanxiyn>	Plex + Pyrex is 16000 lines.
+    
+    Oct 28 16:10:18 -->	pedronis (~sp@91.51.202.62.dial.bluewin.ch) has joined #pypy
+  <sanxiyn>	Hello.
+  <pedronis>	hi
+  <hpk>	pedronis: hi samuele
+  <arigo>	hi samuele
+  *	sanxiyn downloads LLVM 1.0
+  <sanxiyn>	hpk: what do you think about line count? :)
+  *	arigo downloads LLVM 1.0 too
+  <pedronis>	why do we need to be so fast with  LLVM, is why they want to setup a public repo and we want to offer hosting it?
+  <sanxiyn>	we don't need to be hasty, right.
+  <sanxiyn>	hpk: eh. should I register to download?
+  <hpk>	i just did :-)
+  <arigo>	so did i :-)
+  <hpk>	with real name and all :-)
+  <sanxiyn>	me too.
+  <hpk>	pedronis: it cant hurt to contact them informally and see/talk about ideas i think
+  <sanxiyn>	well. it's *huge*;
+  <hpk>	pedronis: if we find out that we were over-enthusiatic we have not lost much, i think
+  <arigo>	pedronis: i think their project is interesting, for PyPy or not, and holger talked about offering hosting
+  <arigo>	pedronis: but mostly i'm sure if llvm is well written it is excellent for PyPy
+  <arigo>	pedronis: this needs to be checked and discussed of course
+  <pedronis>	arigo: what I'm not sure, and we should ask is how much they are interested in optimization for VHLL
+  <arigo>	as opposed to C-like languages ?
+  <pedronis>	arigo: yup, it seems that LLVM need to extended for thing like exact GC, or some possible lookup opts for VHLL
+  <arigo>	yes, i think the VHLL is supposed to do language-specific optimizations itself
+  *	sanxiyn metions Parrot... not.
+  <arigo>	and only emit a low-level code that contains enough information for good low-level optimization
+  <sanxiyn>	Parrot is the only explicitly VHLL VM I know of.
+  <pedronis>	arigo: it seems they are interested in things like region-based memory allocation etc
+  <arigo>	yes, which is fine i think
+  <pedronis>	arigo: which goes more in the device driver, OS kernel direction
+  <hpk>	quote: The Python test classes are more UNIX-centric than they should be, so porting to non-UNIX like platforms 
+  <arigo>	we can have refcounted regions and garbage-collected ones
+  <hpk>	(i thought it's interesting that they are using python for something :-)
+  <sanxiyn>	hpk: Many projects use Python for unittesting, but usually they have not much to do with Python.
+  <sanxiyn>	For example, svn uses Python for unittesting.
+  <hpk>	sanxiyn: sure, but it's still significant information 
+  <arigo>	pedronis: llvm is definitely a low-level tool
+  <hpk>	and BIND and whatnot
+  <sanxiyn>	Yes. It tells us they know about Python. :)
+  <pedronis>	arigo: yes, the point is whether they are happy extending it to support non-low-level stuff
+  <arigo>	pedronis: i'm thinking about it at least as a very good alternative to C for the translator
+  <arigo>	pedronis: but i think they would be happy to design some "hooks" needed for high-level languages
+  <arigo>	pedronis: they don't have Java yet for example but mention wanting to look in that direction
+  <pedronis>	arigo: OK, so using the their static compiler?
+  <arigo>	pedronis: at least
+  <arigo>	pedronis: we should try to write "genllvm.py"
+  <sanxiyn>	If RPython can be translated to C, it surely can be translated to LLVM.
+  <sanxiyn>	And moreover, as Psyco do (perhaps I'm wrong here), some Applevel Python function may be able to be JITted by (LLVM or whatever).
+  <arigo>	pedronis: i think the experiment is worth being made
+  <arigo>	sanxiyn: yes, that's what is beyond my "at least" :-)
+  <pedronis>	arigo: well the experiment is cheap
+  <sanxiyn>	arigo: Will you post log and summary for binding concept and forward-dependency, constraint-based programming?
+  <arigo>	sanxiyn: yes
+  <arigo>	pedronis: yes
+  <sanxiyn>	topic is moving farther and farther from that.
+  <arigo>	sanxiyn: i've saved the relevant parts, will edit them when i've a minute
+  <sanxiyn>	ah, ok.
+  <pedronis>	arigo: my issue is how much their JIT is usable and drivable at runtime, and intergation with things like GC etc
+  <pedronis>	arigo: OTOH yes as target of the translator, that another situation
+  <arigo>	pedronis: yes for the JIT it needs more investigation
+  <arigo>	pedronis: for full Psyco i'd need compilation of basic-blocks-at-a-time (not whole functions at a time)
+  <pedronis>	arigo: yes, I know that, is one of the thing I was wondering about
+  <sanxiyn>	I remeber Psyco does very complex things to accomplish that.
+  <arigo>	pedronis: right now i'm pretty enthusiastic because the LLVM language is just the same as our flowgraphs, so we could probably at least have a JIT for RPython
+  
+    Oct 28 16:31:22 -->	faassen (~faassen@a213-84-57-72.adsl.xs4all.nl) has joined #pypy
+  <arigo>	hi martijn
+  <pedronis>	arigo: yes or just static compilation
+  <pedronis>	arigo: it seems they are investigating trace-based techniques like Dynamo
+  <arigo>	pedronis: actually, i don't know many projects with a good runtime compiler that accepts an in-memory SSA representation of code
+  <faassen>	hey.
+  <arigo>	pedronis: this alone makes llvm interesting, for many projects that I can think about besides or on top of PyPy
+  <sanxiyn>	So LLVM is already a rare case?
+  <hpk>	what really impresses me is how their website and the source code is done
+  <hpk>	faassen: hi martijn
+  <faassen>	hpk: hey! :)
+  <arigo>	pedronis: trace techniques are nice, Psyco's profiler is a bit primitive
+  <sanxiyn>	website is impressive. I don't know C++ very well to judge the code. :(
+  <hpk>	sanxiyn: trust me it's better than average :-)
+  <faassen>	what website is that? :)
+  <arigo>	pedronis: at this point i think we should at least consider using llvm even if we have to change a bit the C++ code to add a couple of instructions.
+  <hpk>	http://llvm.cs.uiuc.edu/#subprojects
+    
+... cut at Martijn's arrival :-)
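
The property discussed in the log, "never write to a variable more than once" (SSA is usually expanded as *static single assignment*), can be illustrated on straight-line code by renaming each assignment target. This is a minimal sketch under stated assumptions: the tuple format and the function ``to_single_assignment`` are hypothetical, and control flow (which would need phi nodes) is deliberately left out.

```python
def to_single_assignment(statements):
    """Rename targets so that every variable is assigned exactly once.

    `statements` is a list of (target, operator, operands) tuples for
    straight-line code only.
    """
    version = {}    # variable name -> latest version number
    renamed = []

    def current(name):
        # Inputs that were never assigned here keep their original name.
        if name in version:
            return "%s_%d" % (name, version[name])
        return name

    for target, operator, operands in statements:
        # Operands must be resolved before the target gets a new version.
        new_operands = tuple(current(op) for op in operands)
        version[target] = version.get(target, 0) + 1
        renamed.append((current(target), operator, new_operands))
    return renamed


# x = a + b ; x = x * c ; y = x + x
code = [
    ("x", "add", ("a", "b")),
    ("x", "mul", ("x", "c")),
    ("y", "add", ("x", "x")),
]
ssa = to_single_assignment(code)
for statement in ssa:
    print(statement)
```

After renaming, the two writes to ``x`` become two distinct names, so a later pass never has to ask which assignment a given use refers to.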

File oscon2003-paper.txt

+Implementing Python in Python 
+==============================
+
+A report from the PyPy project
+******************************
+The PyPy_ [#]_ project aims at producing a simple runtime-system for
+the Python_ language, written in Python itself.  **C** and **Lisp**
+are older examples of languages which are self-hosting.  More
+recently, we have seen implementations of **Scheme** in Scheme_, [#]_
+and **Squeak**, [#]_ [#]_ a Smalltalk_ implementation of Smalltalk_.
+The desire to implement your favourite language *in* your
+favourite language is quite understandable.  Every significant
+computer language has a certain expressiveness and power, and it is
+frustrating to not be able to use that expressiveness and power when
+writing the language itself.
+
+.. _PyPy: http://www.codespeak.net/pypy/
+.. _Python: http://www.python.org/
+.. _Scheme: http://www.swiss.ai.mit.edu/projects/scheme/
+.. _Squeak: http://www.squeak.org
+.. _Smalltalk: http://www.smalltalk.org/
+
+Thus we aim to produce a minimal core which is *simple* and
+*flexible*, and no longer dependent on CPython [#]_.  This should make
+PyPy_ easier than CPython to analyze, change and debug. We will take
+care that PyPy will integrate easily with Psyco_ [#]_ and Stackless_, [#]_ 
+while trying to avoid unwitting C dependencies in our thinking.
+
+.. _Psyco: http://psyco.sourceforge.net/
+.. _Stackless: http://www.stackless.com/
+
+We should be able to produce different versions of PyPy which run
+on different architectures,  for instance one that runs on the 
+Java Virtual Machine, much as Jython_ [#]_ does today.  
+
+.. _Jython: http://www.jython.org/
+
+By keeping things *simple* and *flexible* we can produce code that has
+attractions for both industry and academia.  Academics will find that
+this Python is even easier to teach concepts of language design with.
+Industry will be pleased to know that ending the dependence on CPython
+means that we can produce a Python with a smaller footprint.
+Eventually, we would like to produce a faster Python, which should
+please all.  We are very far from that now, because speed is a distant
+goal.  So far we have only worked on making PyPy *simple* and
+*flexible*.
+
+Most of you know what happens if you type::
+     
+     import this
+
+at your favourite Python prompt.  You get *The Zen of Python*, [#]_
+written by Tim Peters.  It starts::
+
+     Beautiful is better than ugly.
+     Explicit is better than implicit.
+
+and ends with::
+
+     Namespaces are one honking great idea -- let's do more of those!
+
+This raises an interesting question.  What would *doing more of those*  
+mean?  The PyPy project takes one approach.
+
+
+Terminology (a short digression)
+********************************
+
+In PyPy there is a distinction between **application level code**, which
+is the world that PyPy is interpreting, and which can use the full features 
+of the language, and the **interpreter level code** which is the world
+that CPython is interpreting.  The interpreter level code
+needs to be written in a restricted subset of Python.
+(Currently you are mainly restricted to immutable objects; no dicts, you can
+use globals but you cannot modify them.  *What defines Restricted Python?*
+is a matter of current debate.)
+
+In a Python-like language, a running interpreter has three main parts:
+  
+  * the main loop, which shuffles data around and calls the operations defined in the object library according to the bytecode;
+  * the compiler, which represents the static optimization of the source code into an intermediate format, the bytecode; and
+  * the object library, implementing the various types of objects and their semantics.
+
+In PyPy, the three parts are clearly separated and can be replaced
+independently.  The main loop generally assumes little about the semantics
+of the objects: they are essentially black boxes (PyObject pointers). The
+interpreter stack and the variables only contain such black boxes.
+Every operation is done via calls to the object library, such as
+PyNumber_Add().  We haven't done much to make the compiler and the main
+loop into explicit concepts (yet),  because we have been concentrating
+on making separable object libraries.
+
+We call the separable object library an *Object Space*.
+We call the black boxes of an Object Space *Wrapped Objects*.
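
To make the separation concrete, here is a toy sketch of a main loop that treats every value as a black box and delegates each operation to whatever object space it is given. All names here (``StdObjectSpace``, ``TraceObjectSpace``, ``run``, and the made-up bytecode format) are assumptions for illustration and are not the PyPy interfaces.

```python
class StdObjectSpace:
    """Hypothetical object space: wrapped objects are plain Python values."""

    def wrap(self, value):
        return value

    def unwrap(self, w_obj):
        return w_obj

    def add(self, w_a, w_b):
        return w_a + w_b

    def mul(self, w_a, w_b):
        return w_a * w_b


class TraceObjectSpace(StdObjectSpace):
    """Same semantics, but records every operation it is asked to perform."""

    def __init__(self):
        self.trace = []

    def add(self, w_a, w_b):
        self.trace.append("add")
        return super().add(w_a, w_b)

    def mul(self, w_a, w_b):
        self.trace.append("mul")
        return super().mul(w_a, w_b)


def run(bytecode, space):
    """Main loop: shuffles wrapped objects on a stack, never inspects them."""
    stack = []
    for opname, arg in bytecode:
        if opname == "LOAD_CONST":
            stack.append(space.wrap(arg))
        elif opname == "BINARY_ADD":
            w_b, w_a = stack.pop(), stack.pop()
            stack.append(space.add(w_a, w_b))
        elif opname == "BINARY_MULTIPLY":
            w_b, w_a = stack.pop(), stack.pop()
            stack.append(space.mul(w_a, w_b))
        else:
            raise ValueError("unknown opcode: %r" % (opname,))
    return space.unwrap(stack.pop())


# (2 + 3) * 4 in the made-up bytecode format
program = [
    ("LOAD_CONST", 2),
    ("LOAD_CONST", 3),
    ("BINARY_ADD", None),
    ("LOAD_CONST", 4),
    ("BINARY_MULTIPLY", None),
]

result = run(program, StdObjectSpace())
tracer = TraceObjectSpace()
traced_result = run(program, tracer)
print(result, traced_result, tracer.trace)
```

The same ``run`` function works unchanged with either space, which is the sense in which the object library is separable from the main loop.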
+
+One exciting thing is that while existing languages implement *one*
+Object Space, by separating things we have produced an architecture
+which will enable us to run more than one Object Space in the same
+interpreter at the same time.  This idea has some interesting implications.
+
+But first let us dream for a bit.  (Aside from having fun, why should
+we spend our time writing PyPy?)
+
+Goals:
+++++++
+or Dreams, if you prefer
+
+
+A Slimmer Python
+++++++++++++++++
+People who write code for handhelds and other embedded devices often
+wish that they could have a much smaller footprint.  With PyPy it
+would be possible to load a Tiny Object Space which only implements
+the behaviour which they need, and skips the parts that they do not.
+
+A Native Reference Language
++++++++++++++++++++++++++++
+Currently, we have two widely-used implementations of Python, CPython,
+and Jython.  Whenever they differ, the question always comes up: Is
+Jython *wrong*?  By this, people mean, is this behaviour which exists
+in CPython, a matter of the language definition, which Jython ought to
+support, or is it instead an irrelevant detail, the historical
+accident of how things happened to be implemented in CPython, which
+Jython is free to ignore?  It would be useful to have an independent
+Reference Language, written in Python itself to stand as the model of
+what is compliant Python.  A PyPy Object Space will provide this.
+Moreover, people who would like to experiment with a proposed Python
+language change will have an easier task.  Proposed new language
+features could profit from first being written in PyPy so that more
+people could use, comment, and modify them before final approval or
+rejection.
+
+Getting better use of machines with multiple CPUs
+++++++++++++++++++++++++++++++++++++++++++++++++++
+(Also known as *Killing the Global Interpreter Lock*).  We believe
+that we have barely scratched the surface of what can be done with
+the new hardware architectures we have created.  The old idea of
+*one machine, one CPU* persists, so it is difficult to partition the
+work in such a way to keep all the CPUs occupied.  We hope to be able
+to design our interpreter so that each CPU could be busy with its own
+Object Space.
+
+Running different Object Spaces on different machines
++++++++++++++++++++++++++++++++++++++++++++++++++++++
+Well, why not?  Whenever the time needed to do the calculating exceeds
+the time needed to communicate between the various calculators, you
+will benefit by adding more CPUs.  Thus PyPy will provide you with an
+alternative way to write a Cluster.  (`The Beowulf Cluster`_ is
+probably the most famous of Cluster architectures).  Right now 'network
+computing' is in its infancy.  We don't know how to take advantage of
+the resources we have.  Ideally, one could begin a computation on a
+small device, say a mobile phone, and have the interpreter notice that
+the device is underpowered for such a computation and transparently
+forward the computation to a machine with more computational power.
+You could have *peak computing machines* in the same way that
+electrical utilities have plants which are expensive to
+run and only come on-line when demand is extremely high.  As computers
+become ubiquitous, we will *need* such demand-based load sharing.
+
+.. _The Beowulf Cluster: http://www.beowulf-underground.org/index.html
+
+Multiple, Dynamically Changing Implementations of a Single Type
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+Consider the question: *What is the best way to implement a dict*?
+
+How one answers depends on how much data one intends to store.
+If the dict is never expected to have more than a half dozen items, a
+really fast list may be best.  Larger dicts might best be implemented
+as hashes.  For storing enormous amounts of data, a binary tree
+might be just what you desire.  In principle, there is
+nothing to stop your interpreter from keeping statistics on how it is
+being used, and moving from strategy to strategy at runtime.  You
+could implement this in CPython, but we intend to make it a lot
+*easier* to do this in PyPy to encourage such experimentation.
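
The idea of switching implementations at runtime can be sketched in a few lines of plain Python. This is only an illustration, not PyPy code: the class name ``AdaptiveMapping``, the ``strategy`` property and the threshold of six items are assumptions made for the example. The mapping starts as a short list of pairs and silently turns itself into a hash table once it grows.

```python
class AdaptiveMapping:
    """Toy mapping that changes its internal strategy as it grows.

    Hypothetical example: the name and the threshold are illustrative only.
    """

    SMALL_LIMIT = 6  # assumed cut-off ("a half dozen items")

    def __init__(self):
        self._pairs = []    # small strategy: linear scan over (key, value) pairs
        self._table = None  # large strategy: a hash table, created on demand

    @property
    def strategy(self):
        """Name of the strategy currently in use."""
        return "list" if self._table is None else "hash"

    def __setitem__(self, key, value):
        if self._table is not None:
            self._table[key] = value
            return
        for index, (existing_key, _) in enumerate(self._pairs):
            if existing_key == key:
                self._pairs[index] = (key, value)
                return
        self._pairs.append((key, value))
        if len(self._pairs) > self.SMALL_LIMIT:
            # Too big for a linear scan: move to the hash strategy.
            self._table = dict(self._pairs)
            self._pairs = []

    def __getitem__(self, key):
        if self._table is not None:
            return self._table[key]
        for existing_key, value in self._pairs:
            if existing_key == key:
                return value
        raise KeyError(key)

    def __len__(self):
        if self._table is not None:
            return len(self._table)
        return len(self._pairs)
```

Code using such a mapping never sees the switch; only the ``strategy`` property reveals which implementation is active.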
+
+A better teaching vehicle
++++++++++++++++++++++++++
+Python has proven to be an excellent first programming language.
+However, once the student develops a desire to see the nuts and bolts
+of how one implements a language, when they look under the hood, they
+find C, sometimes the sort of C one writes when speed optimisation is
+of paramount importance.  For pedagogical purposes, one would prefer
+a language implementation whose chief virtue is *clarity* so that the
+concepts are illustrated cleanly.
+
+However, academic computer science is littered with tiny teaching
+languages.  Every time we get a few new ideas in language design or
+pedagogical theory we itch to create a language to express these
+ideas.  While understandable, this is wasteful.  Many languages are
+implemented which are more novel than useful, and too many are begun
+with insufficient new ideas.  At one extreme, we end up force-feeding
+our poor students with too many computer languages, too quickly --
+each language designed to teach a particular point.  Alas, many of our
+languages are particularly weak on everything *except* the point we
+wish to make.
+
+At the other extreme, many students go to university and end up only
+learning how to program in commercially successful languages.  This
+reduces university to a Giant Trade School, where it is possible to
+avoid learning Computer Science altogether.  What we need is a
+middle way, a Pythonic way between purity and practicality, theory and
+practice.
+
+PyPy may be able to help.  The separation of the interpreter from its
+Object Spaces should make learning concepts easier, and the ability to
+create one's own Object Spaces provides a
+useful way to compare and contrast different techniques.  Finally, we
+could reasonably ask our students to **implement** interesting
+theories in Python, producing slightly different Object Spaces which
+could leave the bulk of the language implementation unchanged.
+
+There is no better way to learn about compiler writing than writing
+compilers, but much of today's education in compiler writing leaves a
+huge gap between 'the theory that is in the book which the student is
+expected to learn' and 'what is reasonable for a student to implement
+as coursework'.  Students can spend all semester overcoming
+difficulties in *actually getting the IO to work*, and *interfacing
+with the runtime libraries*, while only spending a fraction of the
+time on the concepts which you are trying to teach.
+
+Object Spaces could provide a better fit between the abstract
+concepts we wish to teach and the code written to implement just that.
+
+Runtime Adaptation of C-Libraries and System-Calls
+++++++++++++++++++++++++++++++++++++++++++++++++++
+Python is already widely used for integrating and driving C-libraries
+(for numerical computation, 3D-modeling etc.).  We dream
+of introducing runtime mechanisms that allow PyPy to directly set up and
+execute "native" calls on a machine.  For this to work we need
+"trampoline" (assembler) functions that build a C-like stack frame
+and trigger a call directly into e.g. the Linux kernel or
+any C-library without having to use a C-compiler.  This technique
+would clearly be of great value to embedded devices but also
+to regular Python applications that could more easily use C-libraries
+once they obtain a pythonic description of the library (possibly
+generated from ``.h`` files).
+
+A Smarter, more Dynamic Interpreter
++++++++++++++++++++++++++++++++++++ 
+A Virtual Machine written in Python should be easier to maintain and
+optimise. By recording statistics and analysing the bytecodes that are
+running through the machine, it is possible to find a shorter and
+faster way to run a script - the essence of optimisation. Native code
+compilers do it all the time, but obviously only once, at compilation
+time. Interpreters can optimise in exactly the same way, but at *run
+time*. `The Hotspot Java Virtual Machine`_ already does this.
+
+.. _The Hotspot Java Virtual Machine: http://java.sun.com/products/hotspot/docs/whitepaper/Java_Hotspot_v1.4.1/Java_HSpot_WP_v1.4.1_1002_1.html
+
+Faster Python
++++++++++++++
+(Okay, you've caught us ...)
+While we are writing an adaptive, smarter compiler, we ought to be able
+to make it faster.  We think we can produce a Just-In-Time compiler which
+is faster than CPython without destroying the clarity in our architecture.
+Indeed, the ability to run different object spaces at the same time, in the
+same interpreter will be most useful in this application.  Psyco already
+uses similar techniques to great effect.
+
+Speaking of Running different Object Spaces at the Same Time
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+This dream is a bit far-fetched, but it is worth investigating.  Code
+migration currently involves going through one's entire codebase
+looking for conflicts.  This makes converting to newer versions of the
+language both expensive and difficult.  There is a trade-off between
+getting the new features which contribute to increased productivity in
+program design, and having to fix piles of old code that wasn't broken
+until the language changed.  With the multiple Object Spaces approach it
+may be possible to have your cake and eat it too.
+
+You could load your existing modules with the Object Space they were
+developed for while immediately using new features in new code that
+you develop.  It would be up to the PyPy interpreter to see that these
+Object Spaces communicate with each other transparently.  Only modules
+that would particularly benefit from having the new features would
+be modified.  The rest could sleep peacefully, unchanged.
+
+This leads to:
+
+World Domination
+++++++++++++++++
+And if we don't pull this off, we will have at least learned a lot. This
+in itself makes the project worth doing.  Plus it's fun...
+
+But away from the dreams and back to reality: what do we currently have?
+
+We now have a pretty good working interpreter which implements
+advanced language features such as nested scopes, generators and
+metaclasses.  Most **types** and **builtins** are either completely
+implemented or nearly there.  We have extensive unit tests, since we
+believe in test-driven design, even though we don't always practice
+it.
+
+We currently have three object spaces at least partially implemented.
+
+
+The Trivial Object Space
+++++++++++++++++++++++++
+A PyPy interpreter using the Trivial Object Space is an
+interpreter with its own main loop (written in Python), and nothing
+else.  This main loop manipulates real Python objects and all
+operations are done directly on the Python objects. For example, "1"
+really means "1", and when the interpreter encounters the BINARY_ADD
+bytecode instruction the Trivial Object Space will just add two real
+Python objects together using Python's "+". The same goes for lists,
+dictionaries, classes ... we just use Python's own.  Delegate Object
+Space might have been a better name for this Object Space.
+
+This Object Space is only useful for testing the concept of Object Spaces,
+and our interpreter, or even interpreting different kinds of bytecodes.
+This is already implemented; it is funny to watch *dis.dis* disassembling 
+itself painfully slowly.
+
+Getting this to work was a goal of the Hildesheim Sprint February 16-23.
+It demonstrated that our Object Space Concept was viable, and that our
+interpreter worked.
+
+The Standard Object Space
+++++++++++++++++++++++++++
+The Standard Object Space is the object space that works just like
+Python's, that is, the object space whose black boxes are real Python
+objects that work as expected. Getting the Standard Object Space to
+work was a goal of the Gothenburg Sprint May 24 - 31.
+
+The Standard Object Space defines an abstract parent class, W_Object,
+and a bunch of subclasses like W_IntObject, W_ListObject, and so on. A
+wrapped object (a *black box* for the interpreter main loop) is thus
+an instance of one of these classes. When the main loop invokes an
+operation, say the addition, between two wrapped objects w1 and w2,
+the StandardObjectSpace does some internal dispatching (similar to
+"Objects/abstract.c" in CPython) and invokes a method of the proper
+W_XyzObject class that can do the operation. The operation itself is
+done with the primitives allowed by Restricted Python. The result is
+constructed as a wrapped object again.
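
The wrapping and dispatching described above can be pictured with a small sketch in plain Python. It is not the StandardObjectSpace itself: ``W_Object``, ``W_IntObject`` and ``W_ListObject`` are the names used in the text, while ``ToyObjectSpace``, its dispatch table, and the ``intval`` and ``items_w`` attributes are assumptions made for this example.

```python
class W_Object:
    """Abstract parent class of all wrapped objects."""


class W_IntObject(W_Object):
    def __init__(self, intval):
        self.intval = intval


class W_ListObject(W_Object):
    def __init__(self, items_w):
        self.items_w = items_w  # a list of wrapped objects


class ToyObjectSpace:
    """Hypothetical, much simplified stand-in for the StandardObjectSpace."""

    def __init__(self):
        # Dispatch table for 'add', keyed by the implementation classes.
        self._add_table = {
            (W_IntObject, W_IntObject): self._add_int_int,
            (W_ListObject, W_ListObject): self._add_list_list,
        }

    def wrap(self, value):
        """Turn a plain value into a black box for the main loop."""
        if isinstance(value, int):
            return W_IntObject(value)
        if isinstance(value, list):
            return W_ListObject([self.wrap(item) for item in value])
        raise TypeError("cannot wrap %r" % (value,))

    def unwrap(self, w_obj):
        """Inverse of wrap(), used here only to inspect results."""
        if isinstance(w_obj, W_IntObject):
            return w_obj.intval
        if isinstance(w_obj, W_ListObject):
            return [self.unwrap(w_item) for w_item in w_obj.items_w]
        raise TypeError("cannot unwrap %r" % (w_obj,))

    def add(self, w1, w2):
        """What the main loop calls when it meets BINARY_ADD."""
        implementation = self._add_table.get((type(w1), type(w2)))
        if implementation is None:
            raise TypeError("unsupported operand types for add")
        return implementation(w1, w2)

    def _add_int_int(self, w1, w2):
        return W_IntObject(w1.intval + w2.intval)

    def _add_list_list(self, w1, w2):
        return W_ListObject(w1.items_w + w2.items_w)


space = ToyObjectSpace()
w_sum = space.add(space.wrap(1), space.wrap(2))
```

The main loop only ever handles ``w_sum`` as an opaque wrapped object; the knowledge of what addition means lives entirely in the object space.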
+
+Our first goal was to get the following trivial program to run::
+
+ ### our first trivial program ###
+
+ aStr = 'hello world'
+ print len(aStr)
+
+We needed types and builtins to work.  This ran, slowly.
+
+We began testing and adding types and builtins.
+
+Getting this code to work was the second goal::
+
+ ### a trivial program to test strings, lists, functions and methods ###
+ 
+ def addstr(s1,s2):
+     return s1 + s2
+
+ str = "an interesting string"
+ str2 = 'another::string::xxx::y:aa'
+ str3 = addstr(str,str2)
+ arr = []
+ for word in str.split():
+     if word in str2.split('::'):
+         arr.append(word)
+ print ''.join(arr)
+ print "str + str2 = ", str3
+
+This we accomplished by mid-week.
+
+By the end of the Sprint we produced our first Python program [#]_ that
+ran under PyPy which simply 'did something we wanted to do' and wasn't
+an artificial goal.  It calculated the week-long food bill, and divided
+the result by the 9 Sprint participants::
+
+ ### the first real PyPy Program ###
+
+ slips=[(1, 'Kals MatMarkn', 6150, 'Chutney for Curry', 'dinner Saturday'),
+        (2, 'Kals MatMarkn', 32000, 'Spaghetti, Beer', 'dinner Monday'),
+        (2, 'Kals MatMarkn', -810, 'Deposit on Beer Bottles', 'various'),
+        (3, 'Fram', 7700, 'Rice and Curry Spice', 'dinner Saturday'),
+        ( ... )
+        (23, 'Fram', 2975, 'Potatoes', '3.5 kg @ 8.50SEK'),
+        (23, 'Fram', 1421, 'Peas', 'Thursday dinner'),]
+
+ print (reduce(lambda x, y: x+y, [t[2] for t in slips], 0))/900
+
+PyPy said: 603 SEK, or approximately 75 USD.   Don't believe people who
+tell you that Sprints are too expensive to hold.
+
+The Annotation Object Space
++++++++++++++++++++++++++++
+Our third Sprint was held at Louvain-la-Neuve, Belgium (near
+Brussels), June 21 - 24.  Great progress was made on the
+Annotation Object Space, and we began abstract, symbolic interpretation.
+(We also spent a lot of time firming up the Standard Object Space, and
+improving our documentation and our documentation tools.)
+
+In the two object spaces so far, application-level objects are
+represented in the interpreter as objects that represent a value.
+This is so obvious as to not need pointing out, except for the fact
+that the Annotation space does something completely different.
+
+Here the interpreter-level object corresponding to an application-level
+variable does not describe the value of the variable, but rather the
+state of knowledge about the contents of the variable.
+For example, after the code::
+
+ x = 1
+ y = 2
+ z = x + y
+
+we know exactly what *x*, *y* and *z* contain: the integers *1*, *2* and *3*
+respectively, and this is how the annotation object space represents
+them: there is a class W_Constant that represents totally known values.
+
+However in::
+
+  def f(x, y):
+      z = x + y
+      
+  f(1, 2)
+  f(2, 3)
+
+we know less.  We know that x and y only contain integers, but their
+values are no longer entirely fixed.  In this case, the annotation
+object space could choose to represent the variable x in the body of f
+as *either* the constant *1* or the constant *2*, but at present it punts
+and simply represents it as an instance of W_Integer.
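
A rough sketch of this "state of knowledge" idea, in plain Python, may help. ``W_Constant`` and ``W_Integer`` are the class names used above; the ``add`` and ``merge`` helpers are hypothetical and only model integers, so this is an illustration of the principle rather than the annotation object space itself.

```python
class W_Constant:
    """A totally known value."""

    def __init__(self, value):
        self.value = value


class W_Integer:
    """Some integer whose exact value is not known."""


def add(a, b):
    """Abstract addition: the result is as precise as the inputs allow."""
    if isinstance(a, W_Constant) and isinstance(b, W_Constant):
        return W_Constant(a.value + b.value)
    return W_Integer()


def merge(a, b):
    """Combine what two call sites tell us about the same variable.

    Hypothetical helper: identical constants stay constant, anything
    else degrades to 'some integer', as the text says happens at present.
    """
    if (isinstance(a, W_Constant) and isinstance(b, W_Constant)
            and a.value == b.value):
        return a
    return W_Integer()


# x = 1; y = 2; z = x + y  -> everything is exactly known
z_known = add(W_Constant(1), W_Constant(2))

# f(1, 2) and f(2, 3)  -> x and y are only known to be integers
x_in_f = merge(W_Constant(1), W_Constant(2))
y_in_f = merge(W_Constant(2), W_Constant(3))
z_in_f = add(x_in_f, y_in_f)
```

In the first case ``z_known`` carries the constant 3; in the second, ``z_in_f`` only records that the result is an integer.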
+
+The eventual hope is to run all of the code that implements PyPy's
+interpreter and the standard object space with the annotation object
+space and gain sufficient knowledge of the values involved to generate
+efficient code (in C, Pyrex_, O'Caml, Java or whatever) to do the same
+job.
+
+If you're wondering how we expect to get a speed up of 20000 times by
+this translation when a speed up of 100 or so times is all that is
+usually obtained by rewriting in C, you have to understand that the
+main reason for the standard object space's current slowness is the
+computation of which code to execute each time a multimethod is
+called.  The knowledge gathered by the Annotation Object Space should
+be sufficient to remove or at least substantially reduce this computation
+for most of the call sites.
+
+Current plans are to use the information gathered from the Annotation Object 
+Space to emit Pyrex_ code which itself will generate a CPython extension.
+
+.. _Pyrex: http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/
+
+Types
++++++
+Types are implemented by the class W_TypeObject. This is where
+inheritance and the Method Resolution Order are defined, and where
+attribute look-ups are done.
+
+Instances of user-defined types are implemented as W_UserObjects. A
+user-defined type can inherit from built-in types (maybe more than
+one, although this is incompatible with CPython). The W_UserObject
+delegator converts the object into any of these "parent objects" if
+needed. This is how user-defined types appear to inherit all built-in
+operator implementations.
+
+Delegators should be able to invoke user code; this would let us
+implement special methods like __int__() by calling them within a
+W_UserObject -> int delegator.
+
+Multimethods
+++++++++++++
+Interpreter-level classes correspond to implementations of
+application-level types.  The hierarchy among the classes used for the
+implementations is convenient for implementation purposes. It is not
+related to any application-level type hierarchy.  Multimethods
+dispatch by looking in a set of registered functions. Each registered
+function has a signature, which defines which object implementation
+classes are accepted at the corresponding argument position.
+
+Specifics of multimethods
++++++++++++++++++++++++++
+Multimethods dispatch more-specific-first, left-to-right (i.e. if
+there is an exact match for the first argument it will always be tried
+first).
+
+Delegators are automatically chained (i.e. A -> B and B -> C would be
+combined to allow for A -> C delegation).
+
+Delegators do not publish the class of the converted object in
+advance, so that the W_UserObject delegator can potentially produce
+any other built-in implementation. This means chaining and chain loop
+detection cannot be done statically (at least without help from an
+analysis tool like the translator-to-C). To break loops, we can assume
+(unless a particular need arises) that delegators are looping when
+they return an object of an already-seen class.
+
+Registration
+++++++++++++
+The register() method of multimethods adds a function to its database
+of functions, with the given signature. A function that raises
+FailedToImplement causes the next match to be tried.
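
To make the registration and fall-through behaviour concrete, here is a minimal sketch in plain Python. It is not PyPy's multimethod code: the ``MultiMethod`` class, the argument order of ``register()`` and the ranking by position in the method resolution order are assumptions chosen for the example, and delegators are left out entirely.

```python
class FailedToImplement(Exception):
    """Raised by an implementation that declines to handle its arguments."""


class MultiMethod:
    """Hypothetical toy multimethod with signature-based dispatch."""

    def __init__(self, name):
        self.name = name
        self._registry = []  # list of (signature, function) entries

    def register(self, function, *signature):
        """Add a function that accepts the given implementation classes."""
        self._registry.append((signature, function))

    def _candidates(self, args):
        matches = []
        for signature, function in self._registry:
            if len(signature) != len(args):
                continue
            if all(isinstance(a, c) for a, c in zip(args, signature)):
                # Distance in the MRO per argument; 0 means an exact match.
                rank = tuple(type(a).__mro__.index(c)
                             for a, c in zip(args, signature))
                matches.append((rank, function))
        # Tuples compare left to right, so the first argument dominates.
        matches.sort(key=lambda match: match[0])
        return [function for _, function in matches]

    def __call__(self, *args):
        for function in self._candidates(args):
            try:
                return function(*args)
            except FailedToImplement:
                continue  # try the next, less specific match
        raise TypeError("no implementation of %s for these arguments"
                        % self.name)


class W_Object:
    pass


class W_IntObject(W_Object):
    def __init__(self, intval):
        self.intval = intval


def add_any_any(w1, w2):
    return "generic"


def add_int_int(w1, w2):
    result = w1.intval + w2.intval
    if result > 2 ** 31 - 1:  # pretend that machine integers overflow
        raise FailedToImplement
    return W_IntObject(result)


add = MultiMethod("add")
add.register(add_any_any, W_Object, W_Object)
add.register(add_int_int, W_IntObject, W_IntObject)
```

Even though the generic function is registered first, ``add(W_IntObject(1), W_IntObject(2))`` reaches ``add_int_int`` because it is the more specific match; when that function raises FailedToImplement, the call falls through to ``add_any_any``.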
+
+'delegate' is the special unary multimethod that should try to convert
+its argument to something else. For greater control, it can also
+return a list of 2-tuples (class, object), or an empty list for
+failure to convert the argument to anything. All delegators will
+potentially be tried, and recursively on each other's results to do
+chaining.
+
+Multimethod slicing
++++++++++++++++++++
+Multimethods are visible to user code as (bound or unbound) methods
+defined for the corresponding types. (At some point built-in functions
+like *len()* and the *operator.xxx()* should really directly map to the
+multimethods themselves, too.)
+
+To build a method from a multimethod (e.g. as in *l.append* or
+*int.__add__*), the result is actually a "slice" of the whole
+multimethod, i.e. a sub-multimethod in which the registration table
+has been trimmed down. (Delegation mechanisms are not restricted for
+sliced multimethods.)
+
+Say that C is the class the new method is attached to (in the above
+examples, respectively, C=type(l) and C=int). The restriction is based
+on the registered class of the first argument ('self' for the new
+method) in the signature. If this class corresponds to a fixed type
+(as advertized by 'statictype'), and this fixed type is C or a
+superclass of C, then we keep it.
+
+Some multimethods can also be sliced along their second argument,
+e.g. for __radd__().
+
+A Word of History
++++++++++++++++++
+
+The PyPy project was started in January of 2003 by Armin Rigo,
+Christian Tismer and Holger Krekel. The latter organized the initial
+Coding-Sprint in Hildesheim, Germany where the interpreter and the
+Trivial Object Space were implemented and people first got together.
+The second sprint in Göteborg, Sweden was organized by Jacob Hallén and
+Laura Creighton and it resulted in much of today's Standard Object Space
+implementation.  Benjamin Henrion and Godefroid Chapelle organized the
+third sprint in Louvain-La-Neuve, Belgium which led to a pretty complete
+Standard ObjectSpace and interpreter and the beginnings of Abstract
+Interpretation (Annotation Object Space).  These three coding
+sprints in the course of half a year brought PyPy to existence, though
+there was some off-sprint development and discussions going on.
+
+Participants
+++++++++++++
+
+..  line-block::
+
+ Laura Creighton
+ Stephan Diehl
+ Dinu Gherman
+ Jacob Hallén
+ Michael Hudson
+ Günter Jantzen
+ Holger Krekel
+ Anders Lehmann
+ Jens-Uwe Mager
+ Alex Martelli
+ Tomek Meka
+ Rocco Moretti
+ Samuele Pedroni
+ Anna Ravenscroft
+ Armin Rigo
+ Guido van Rossum
+ Christian Tismer
+
+Conclusions
++++++++++++
+It is a little early for conclusions, but our architecture seems to be
+working so far.  Sprints are a lot of fun, and a great way to write
+code, and meet interesting people.  We're productively lazy, and so
+have created a few tools that could possibly be useful to other
+projects ... parts of our test rig, for example, and automatic ReST
+processing on checkins.  An Infrastructure mini-Sprint, again at
+Hildesheim, is planned which may produce tools good enough to package
+and release separately.  
+
+Thank you
++++++++++
+As was to be expected we are using Python web applications (mailman_,
+roundup_, moinmoin_) to host our project.
+
+.. _mailman: http://www.list.org/
+.. _roundup: http://roundup.sourceforge.net/
+.. _moinmoin: http://moin.sourceforge.net/
+
+The members of the PyPy team are especially grateful to RyanAir_, without
+which holding Sprints would be prohibitively expensive, freenode.net_
+which lets us communicate with each other on the #pypy channel, and the
+Subversion_ development team, without whom restructuring the entire universe
+whenever we feel like it would have been close to impossible.
+
+.. _freenode.net: http://www.freenode.net/
+.. _RyanAir: http://www.ryanair.com/
+.. _Subversion: http://subversion.tigris.org/
+
+
+.. [#] The PyPy homepage: http://www.codespeak.net/pypy/
+.. [#] See for instance, Scheme48's PreScheme
+.. [#] The Squeak homepage: http://www.squeak.org/
+.. [#] See *Back to the Future The Story of Squeak, A Practical 
+       Smalltalk Written in Itself* ftp://st.cs.uiuc.edu/Smalltalk/Squeak/docs/OOPSLA.Squeak.html
+.. [#] CPython is what we call the commonly available Python_ which you
+       can download from http://www.python.org .  This is to distinguish it
+       from other implementations of the Python_ language, such as
+       Jython_, which is written for the Java virtual machine.
+.. [#] The Psyco homepage: http://psyco.sourceforge.net/
+.. [#] The Stackless homepage: http://www.stackless.com/
+.. [#] The Jython homepage: http://www.jython.org/
+.. [#] The complete text is as follows:
+ 
+..  line-block::
+
+   *The Zen of Python*
+    
+    by Tim Peters
+   
+..  line-block::
+
+   *Beautiful is better than ugly.
+   Explicit is better than implicit.
+   Simple is better than complex.
+   Complex is better than complicated.
+   Flat is better than nested.
+   Sparse is better than dense.
+   Readability counts.
+   Special cases aren't special enough to break the rules.
+   Although practicality beats purity.
+   Errors should never pass silently.
+   Unless explicitly silenced.
+   In the face of ambiguity, refuse the temptation to guess.
+   There should be one-- and preferably only one --obvious way to do it.
+   Although that way may not be obvious at first unless you're Dutch.
+   Now is better than never.
+   Although never is often better than _right_ now.
+   If the implementation is hard to explain, it's a bad idea.
+   If the implementation is easy to explain, it may be a good idea.
+   Namespaces are one honking great idea -- let's do more of those!*
+
+.. [#] The full text for historians and other curious people is:
+
+..  line-block::      
+     
+     slips=[
+       (1, 'Kals MatMarkn', 6150, 'Chutney for Curry', 'dinner Saturday'),
+       (2, 'Kals MatMarkn', 32000, 'Spaghetti, Beer', 'dinner Monday'),
+       (2, 'Kals MatMarkn', -810, 'Deposit on Beer Bottles', 'various'),
+       (3, 'Fram', 7700, 'Rice and Curry Spice', 'dinner Saturday'),
+       (4, 'Kals MatMarkn', 25000, 'Alcohol-Free Beer, sundries', 'various'),
+       (4, 'Kals MatMarkn', -1570, "Michael's toothpaste", 'none'),
+       (4, 'Kals MatMarkn', -1690, "Laura's toothpaste", 'none'),
+       (4, 'Kals MatMarkn', -720, 'Deposit on Beer Bottles', 'various'),
+       (4, 'Kals MatMarkn', -60, 'Deposit on another Beer Bottle', 'various'),
+       (5, 'Kals MatMarkn', 26750, 'lunch bread meat cheese', 'lunch Monday'),
+       (6, 'Kals MatMarkn', 15950, 'various', 'dinner Tuesday and Thursday'),
+       (7, 'Kals MatMarkn', 3650, 'Drottningsylt, etc.', 'dinner Thursday'),
+       (8, 'Kals MatMarkn', 26150, 'Chicken and Mushroom Sauce', 'dinner Wed'),
+       (8, 'Kals MatMarkn', -2490, 'Jacob and Laura -- juice', 'dinner Wed'),
+       (8, 'Kals MatMarkn', -2990, "Chicken we didn't cook", 'dinner Wednesday'),
+       (9, 'Kals MatMarkn', 1380, 'fruit for Curry', 'dinner Saturday'),
+       (9, 'Kals MatMarkn', 1380, 'fruit for Curry', 'dinner Saturday'),
+       (10, 'Kals MatMarkn', 26900, 'Jansons Frestelse', 'dinner Sunday'),
+       (10, 'Kals MatMarkn', -540, 'Deposit on Beer Bottles', 'dinner Sunday'),
+       (11, 'Kals MatMarkn', 22650, 'lunch bread meat cheese', 'lunch Thursday'),
+       (11, 'Kals MatMarkn', -2190, 'Jacob and Laura -- juice', 'lunch Thursday'),
+       (11, 'Kals MatMarkn', -2790, 'Jacob and Laura -- cereal', 'lunch Thurs'),
+       (11, 'Kals MatMarkn', -760, 'Jacob and Laura -- milk', 'lunch Thursday'),
+       (12, 'Kals MatMarkn', 18850, 'lunch bread meat cheese', 'lunch Friday'),
+       (13, 'Kals MatMarkn', 18850, 'lunch bread meat cheese', 'guestimate Sun'),
+       (14, 'Kals MatMarkn', 18850, 'lunch bread meat cheese', 'guestimate Tues'),
+       (15, 'Kals MatMarkn', 20000, 'lunch bread meat cheese', 'guestimate Wed'),
+       (16, 'Kals MatMarkn', 42050, 'grillfest', 'dinner Friday'),
+       (16, 'Kals MatMarkn', -1350, 'Deposit on Beer Bottles', 'dinner Friday'),
+       (17, 'System Bolaget', 15500, 'Cederlunds Caloric', 'dinner Thursday'),
+       (17, 'System Bolaget', 22400, '4 x Farnese Sangiovese 56SEK', 'various'),
+       (17, 'System Bolaget', 22400, '4 x Farnese Sangiovese 56SEK', 'various'),
+       (17, 'System Bolaget', 13800, '2 x Jacobs Creek 69SEK', 'various'),
+       (18, 'J and Ls winecabinet', 10800, '2 x Parrotes 54SEK', 'various'),
+       (18, 'J and Ls winecabinet', 14700, '3 x Saint Paulin 49SEK', 'various'),
+       (18, 'J and Ls winecabinet', 10400, '2 x Farnese Sangioves 52SEK','cheaper when we bought it'),
+       (18, 'J and Ls winecabinet', 17800, '2 x Le Poiane 89SEK', 'various'),
+       (18, 'J and Ls winecabinet', 9800, '2 x Something Else 49SEK', 'various'),
+       (19, 'Konsum', 26000, 'Saturday Bread and Fruit', 'Slip MISSING'),
+       (20, 'Konsum', 15245, 'Mooseburgers', 'found slip'),
+       (21, 'Kals MatMarkn', 20650, 'Grilling', 'Friday dinner'),
+       (22, 'J and Ls freezer', 21000, 'Meat for Curry, grilling', ''),
+       (22, 'J and Ls cupboard', 3000, 'Rice', ''),
+       (22, 'J and Ls cupboard', 4000, 'Charcoal', ''),
+       (23, 'Fram', 2975, 'Potatoes', '3.5 kg @ 8.50SEK'),
+       (23, 'Fram', 1421, 'Peas', 'Thursday dinner'),
+       (24, 'Kals MatMarkn', 20650, 'Grilling', 'Friday dinner'),
+       (24, 'Kals MatMarkn', -2990, 'TP', 'None'),
+       (24, 'Kals MatMarkn', -2320, 'T-Gul', 'None')
+       ]
+
+     print [t[2] for t in slips]
+     print (reduce(lambda x, y: x+y, [t[2] for t in slips], 0))/900

File papers/grove.pdf

Binary file added.

File psycoguide.ps.gz

Binary file added.

File pypy-talk-ep2004.txt

+
+EuroPython 2004 PyPy talk
+
+1. Motivation / small summary of our EU efforts (Laura?)
+
+2. architecture / pygame view  (Michael, Armin)
+
+3. source explanation + lower level architecture + examples (Holger)
+
+4. translation / flowgraphs / (Samuele)
+
+5. future directions / how to get involved / questions 
+   (all together)
+

File sprintinfo/AmsterdamReport.txt

+
+Amsterdam Sprint 14-21 Dec. 2003
+--------------------------------
+
+Here is a mail-report from Holger Krekel sent to pypy-dev. 
+ 
+Hello PyPy,
+
+the Amsterdam sprint has just finished and here is a report and some
+surrounding and outlook information. As usual please comment/add/correct
+me - especially the sprinters.  I also wouldn't mind some discussion of
+what and how we could do things better at the next sprint.  First of
+all, big thanks to *Etienne Posthumus* who patiently helped organize
+this sprint even though he had to deal with various other problems at
+the same time. 
+
+Before i start with details i recommend reading through the new
+Architecture document at
+
+    http://codespeak.net/pypy/index.cgi?doc/architecture.html
+
+in case you don't know what i am talking about regarding the Amsterdam
+sprint results :-)
+
+Originally, we intended to go rather directly for translation and thus
+for a first release of PyPy.  But before the sprint we decided to
+go differently about the sprint, not only because Michael Hudson and
+Christian Tismer had to cancel their participation but also because we
+wanted to give a smooth introduction for the new developers attending
+the sprint.  Therefore we didn't press very hard at translation and type
+inference and major surprises were awaiting us anyway ...
+
+
+fixing lots and lots of bugs, adding more builtins and more introspection
+-------------------------------------------------------------------------
+
+On this front mainly Alex Martelli, Patrick Maupin, Laura Creighton and 
+Jacob Hallen added and fixed a lot of builtins and modules and made it
+possible to run - among other modules - the pystone benchmark: on most machines
+we have more than one pystone with PyPy already :-)  While trying to
+get 'long' objects working Armin and Samuele realized that the StdObjSpace 
+multimethod mechanism now desperately needs refactoring.  Thus the current
+"long" support is just another hack (TM) which nevertheless allows us to execute
+more of CPython's regression tests.
+
+In a related effort, Samuele and yours truly made introspection of
+frames, functions and code objects compatible with CPython so that the
+"dis.dis(dis.dis)" goal finally works, i.e. can be run through
+PyPy/StdObjSpace. This is done by the so called pypy\_ protocol which
+an object space uses to delegate operations on core execution objects
+(functions, frames, code ...) back to the interpreter. 
+
+redefining our standard type system at application level
+--------------------------------------------------------
+
+Originally we thought that we could more or less easily redefine the python
+type objects at application level and let them access interpreter level
+objects and implementations via some hook. This turned out to be a
+bootstrapping nightmare (e.g. in order to instantiate classes you need
+type objects already but actually we want our first classes to define
+exactly those).  While each particular problem could be worked around 
+somehow Armin and Samuele realized they were opening a big can of worms ...
+and couldn't close it in due time. 
+
+The good news is that after lots of discussions and tossing ideas around
+we managed to develop a new approach (see the end of the report) which
+raised our hopes we can finally define the types at application level 
+and thus get rid of the ugly and hard to understand interpreter level
+implementation. 
+
+Improving tracing and debugging of PyPy
+---------------------------------------
+
+With PyPy you often get long tracebacks and other problems which make
+it hard to debug sometimes.  Richard Emslie, Tomek Meka and I implemented
+a new Object Space called "TraceObjSpace" which can wrap the Trivial and 
+Standard Object Space and will trace all objectspace operations as well as 
+frame creation into a long list of events.  Richard then in a nightly hotel
+session wrote "tool/traceinteractive.py" which will nicely reveal
+what is going on if you execute python statements: which frames are created,
+which bytecodes are executed and what object space operations are involved.
+Just execute traceinteractive.py (with python 2.3) and type some random function
+definition and see PyPy's internals at work ...  It only works with python 2.3
+because we had to rewrite python's dis module to allow programmatic access
+to disassembling byte codes. And this module has considerably changed
+from python 2.2 to 2.3 (thanks, Michael :-)
+
+"finishing" the Annotation refactoring
+--------------------------------------
+
+That should be easy, right?  Actually Guido van Rossum and Armin had
+started doing type inference/annotation in Belgium just before
+EuroPython and we have since refactored it already at the Berlin sprint
+and in between the sprints several times.  But it turned out that
+Annotations as we did them are *utterly broken* in that we try to do a
+too general system (we had been talking about general inference engines
+and such) thus making "merging" of two annotations very hard to do in
+a meaningful way. But after being completely crushed on one afternoon, Samuele
+and Armin came up with a new *simpler* approach that ... worked and 
+promises to not have the same flaws.  It got integrated into the 
+translator already and appears to work nicely. 
+
+I think this is the fourth refactoring of just "three files" and, of course, we
+already have the 'XXX' virus spreading again :-) 
+
+refactoring/rewriting the test framework
+----------------------------------------
+
+PyPy has an ever increasing test-suite which requires a lot of flexibility 
+that the standard unittest.py module just doesn't provide.  Currently, we have 
+in tool/test.py and interpreter/unittest_w.py a collection of more or less
+weird hacks to make our (now over 500) tests run either at interpreter level 
+or at application level which means they are actually interpreted by PyPy. 
+Tests from both levels can moreover be run with different object spaces. 
+Thus Stefan Schwarzer and I came up with a rewrite of unittest.py which
+is currently in 'newtest.py'.  During my train ride back to Germany i
+experimentally used our new approach which let our tests run 
+around 30% faster (!) as a side effect.  More about this in separate mails
+as this is - as almost every other area of PyPy - refactoring-in-progress. 
+
+Documentation afternoon
+-----------------------
+
+We (apart from Richard who had a good excuse :-) also managed on Friday
+to devote a full afternoon to documentation.  There is now an emerging
+"architecture" document, a "howtopypy" (getting started) and a "goals"
+document here:
+
+    http://codespeak.net/pypy/index.cgi?doc
+
+Moreover, we deleted misleading or redundant wiki-pages. In case you miss
+one of them you can still access them through the web by following 
+"Info on this page" and "revision history".  
+
+We also had a small lecture from Theo de Ridder who dived into 
+our flowgraph and came up with some traditional descriptions from 
+compiler theory to describe what we are doing. He also inspired us with
+some other nice ideas and we certainly hope he revisits the project
+and continues to try to use it for his own purposes. 
+
+There of course is no question that we still need more higher 
+level documentation. Please don't use the wiki for serious 
+documentation but make ReST-files in the doc-subtree. I guess 
+that we will make the "documentation afternoon" a permanent 
+event on our sprints. 
+
+Towards more applevel code ...
+------------------------------
+
+As mentioned before, the approach of defining the python types at
+application level didn't work out as easily as hoped.  But luckily, we
+had - again in some pub - the rescuing idea: a general mechanism that
+lets us trigger "exec/eval" of arbitrary interpreter level code given
+as a string.  Of course this by itself is far too dynamic to be
+translatable but remember: we can perform *arbitrarily dynamic* pythonic
+tricks while still *initializing* object spaces and the interpreter.  
+Translation will start with executing the initialized interpreter/objspace
+through another interpreter/flowobjspace instance. 
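The idea of running dynamic code only during initialization can be sketched in plain Python. This is a minimal illustration under stated assumptions, not PyPy's mechanism: `MODULE_SOURCE`, `build_module` and the module name `demo` are all hypothetical.

```python
import types

# Hypothetical source for a tiny "builtin" module, kept as a plain string.
MODULE_SOURCE = '''
def double(x):
    return 2 * x
'''


def build_module(name, source):
    """Create a module object by exec'ing source once, at initialization."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


# The dynamic step (exec) happens here, during setup.  Afterwards the
# module only contains ordinary function objects.
demo = build_module("demo", MODULE_SOURCE)
```

After setup, `demo.double` is a normal function, so whatever analyses the initialized state never has to deal with the string or the `exec` call itself.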
+
+Some hacking on the last day showed that this new approach makes the
+definition of "builtin" modules a lot more pythonic: modules are not
+complicated class instances anymore but really look like a normal module
+with some "interpreter level escapes".  It appears now that in combination
+with some other considerations we will finally get to "types.py" defining
+the python types thus getting rid of the cumbersome 10-12 type.py files
+in objspace/std.  There are still some XXX's to fight, though.  
+
+Participants 
+------------
+
+    Patrick Maupin 
+
+    Richard Emslie
+
+    Stefan Schwarzer
+
+    Theo de Ridder
+
+    Alex Martelli
+
+    Laura Creighton
+
+    Jacob Hallen
+
+    Tomek Meka
+
+    Armin Rigo
+
+    Guenter Jantzen
+
+    Samuele Pedroni
+
+    Holger Krekel
+
+and Etienne Posthumus who made our Amsterdam Sprint possible. 
+
+outlook, what comes next? 
+-------------------------
+
+On the Amsterdam sprint maybe more than ever we realized how strongly 
+refactoring is the key development activity in PyPy. Watch those "XXX" :-)
+Some would argue that we should instead think more about what we are doing 
+but then you wouldn't call that extreme programming, would you? 
+
+However, we haven't fixed a site and date for the next sprint yet. We would
+like to do it sometime in February in Switzerland on some nice mountain but no
+nice facility has emerged yet.  Later in the year we might be able to do a
+sprint in Dublin and of course one right before EuroPython in Sweden.
+Btw, if someone wants to offer help organizing a sprint, feel free
+to contact us.
+
+Also there was some talk on IRC that we might do a "virtual sprint" so
+that our non-european developers can more easily participate. This would
+probably mean doing screen-sessions and using some Voice-over-IP
+technology ... we'll see what will eventually evolve.  After all, we
+might also soon get information from the EU regarding our recent
+proposal which should make sprint planning easier in the long run.
+We'll see.
+
+For now i wish everyone some nice last days of the year which has 
+been a fun one regarding our ongoing pypy adventure ...
+
+cheers,
+
+    holger (who hopes he hasn't forgotten someone or something ...)
+

File sprintinfo/BerlinReport.txt

+a sprint report (actually a mail from Holger Krekel on pypy-dev)
+
+Hi Florian,
+
+[Florian Schulze Sat, Oct 04, 2003 at 10:34:25PM +0200]
+
+  Hi!
+
+  How well did the sprint work out?
+ 
+  I have seen that there is some pyrex code generation now and there are
+  tests, but what were the results in this area during the sprint?
+ 
+  Just a very short mail with some information would be greatly
+  appreciated.
+
+Here is my take. Other mileages may vary so excuse me if i miss anything.
+
+On Monday morning we made a few design decisions which led to the
+implementation of the following abstractions in the next two days:
+
+- a new FlowObjSpace which does abstract interpretation
+  plus some very nice tricks (which we came up with during a
+  long-winded discussion in a restaurant :-) to construct
+  a FunctionGraph.  This functiongraph (fully) represents the abstract
+  or symbolic execution of a function.   e.g. for this function::
+
+      def while_func(i):
+          total = 0
+          while i > 0:
+              total = total + i
+              i = i - 1
+          return total
+
+  the following graph is generated (shown here in a slightly
+  optimized version):
+
+        http://codespeak.net/~hpk/while_func.ps
+
+
+- the pyrex-translator also takes this objectmodel (in flowmodel.py) and
+  generates Pyrex-Code from it.  The generated code looks pretty low-level
+  but this is expected as we eventually want to generate C or assembly
+  directly.  For the above function the following pyrex-source code is
+  generated (again with some easy optimizations applied)::
+
+    def while_func(object v413):
+      v419, v420 = v413, 0
+      cinline "Label1:"
+      v422 = v419 > 0
+      if v422:
+        v424 = v420 + v419
+        v425 = v419 - 1
+        v419, v420 = v425, v424
+        cinline "goto Label1;"
+      else:
+        return v420
+
+  btw, the 'cinline' statement is a hack to pyrex that allows inserting
+  arbitrary C-code. An objectspace cannot really identify loops
+  and so we need "goto". We consider gotos to be useful unless you have
+  to type and understand them manually :-)
+
+- translator/annotation.py also takes the flowmodel and applies a
+  new technique for type inference: it uses space-operations to
+  note 'assertions' about variables and relaxes those assertions
+  during analysis of the flowgraph.  IOW we didn't come up with
+  yet another type-system (which is the classical approach) but
+  reuse the notion of "space-operations" which we had from the beginning
+  of the project. Btw, Armin thinks that this type-inference algorithm
+  is worth a scientific paper but more about this either later and/or
+  from him.
+- we took Jonathan David Riehl's Python-Parser (written completely
+  in python using its own "rex"-approach) and adapted it so that
+  it will be a drop-in replacement for CPython's current parser
+  (living the boring life of a C-extension). Actually Jonathan's
+  larger 'basil' project is now in the codespeak-repository and
+  we can easily link it into PyPy or branch off it if necessary.
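The block structure behind the `while_func` graph and its label/goto rendering in the list above can be imitated in plain Python. The sketch below is hand-written and makes several assumptions: the block functions, the `GRAPH` dictionary and `run_graph` are hypothetical names, and the real flowmodel.py objects look different.

```python
def while_func(i):
    total = 0
    while i > 0:
        total = total + i
        i = i - 1
    return total


# Each block maps its input values to (next_block_name, new_values).
# A next_block_name of None means "return the value".
def start(i):
    return "header", (i, 0)


def header(i, total):
    return ("body" if i > 0 else "exit"), (i, total)


def body(i, total):
    return "header", (i - 1, total + i)


def exit_block(i, total):
    return None, total


GRAPH = {"start": start, "header": header, "body": body, "exit": exit_block}


def run_graph(graph, *args):
    """Follow links from block to block until a block returns a value."""
    name, values = "start", args
    while True:
        name, values = graph[name](*values)
        if name is None:
            return values
```

The link from `body` back to `header` plays the role of the "goto Label1" in the generated Pyrex code: the loop exists only as a backward link between blocks.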
+
+So altogether the Flowgraph/Functiongraph/flowmodel (there is no
+completely fixed terminology yet) is the central point for several
+independent algorithms that - if combined - eventually produce typed C-code.
+
+To sum it up there are the following abstractions:
+
+============    ===============================================================
+interpreter     interpreting bytecode, dispatching operations on objects to
+                an objectspace
+objectspace     implementing operations on boxed objects
+stdobjspace     a concrete space implementing python's standard type system
+flowobjspace    a concrete space performing abstract/symbolic interpretation
+                and producing a (bytecode-independent) flowmodel of execution
+annotator       analysing the flowmodel to infer types
+genpyrex        taking the (annotated) flowmodel to generate pyrex-code
+pyrex           translating into a C-extension
+============    ===============================================================
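The split between an interpreter and interchangeable object spaces in the table above can be illustrated with a small sketch. Everything in it is an assumption for illustration: `ConcreteSpace`, `RecordingSpace` and `evaluate` are hypothetical names, and the real spaces have a much larger interface.

```python
class ConcreteSpace:
    """Hypothetical space that performs operations on real values."""

    def wrap(self, value):
        return value

    def add(self, w_a, w_b):
        return w_a + w_b

    def mul(self, w_a, w_b):
        return w_a * w_b


class RecordingSpace:
    """Hypothetical space that records operations instead of computing."""

    def __init__(self):
        self.operations = []

    def wrap(self, value):
        return repr(value)

    def _record(self, opname, *w_args):
        result = "v%d" % len(self.operations)
        self.operations.append((opname, w_args, result))
        return result

    def add(self, w_a, w_b):
        return self._record("add", w_a, w_b)

    def mul(self, w_a, w_b):
        return self._record("mul", w_a, w_b)


def evaluate(space, w_x):
    """Stand-in for interpreter code: computes x * x + 1 via the space."""
    w_square = space.mul(w_x, w_x)
    return space.add(w_square, space.wrap(1))
```

The same `evaluate` code yields a number when given a `ConcreteSpace` and a list of symbolic operations when given a `RecordingSpace`, which is roughly the relationship between the stdobjspace and the flowobjspace rows.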
+
+As a consequence the former Ann(otation)Space has been ripped apart
+(partly into flowobjspace) and is gone now. Long live the flowspace.
+
+A really nice property of the above abstractions is that they allow
+development and testing *independently* from one another which was
+of invaluable help. Thanks here go to Greg Ewing for Pyrex and sorry
+for the evil cinline-hack :-)
+
+Anybody interested in helping with the next steps might look into
+the TODO file in the pypy-root directory.  We also discussed a
+refactored flowmodel yesterday evening, which we want to employ
+soon.
+
+Big thanks go to Tomek Meka and Christian Tismer for organizing the
+sprint and Stephan Diehl and Dinu Gherman for their help in various
+organizational areas. And especially to Jonathan David Riehl who
+made it from Chicago. We hope he can stay with us more often. And
+here is a (hopefully complete) list of people who attended and made
+all of the above possible:
+
+- Armin Rigo
+- Christian Tismer
+- Dinu Gherman
+- Guenter Jantzen
+- Jonathan David Riehl
+- Samuele Pedroni
+- Stephan Diehl
+- Tomek Meka
+
+and shame on me if i forgot anyone (i am tired ...)
+
+And of course many many thanks to Laura Creighton (AB Strakt),
+Nicolas Chauvat (Logilab) and Alistair Burt (DFKI) who tried hard to
+work with us on EU-funding-issues.  Actually we came up with a nice technical
+2-year plan but a lot of business issues still need to be resolved
+and fixed. Let's hope that the EU-funding effort is as successful as
+our coding sprints this year have been.  Ah yes, the next sprint we hope
+to do mid-december probably in Amsterdam.  If all goes well (some more
+people helping between the sprints that is :-) we might even do a first
+public release with PyPy prototypically running as a C-extension to CPython.
+
+That's it for now from me.  (sprinters: Please correct/fix any issues i
+misrepresented)
+
+cheers,
+
+    holger
+

File sprintinfo/HildesheimReport.txt

+Hildesheim Sprint Report
+========================
+
+The Hildesheim Sprint provided a chance to meet and decide several crucial design considerations. 
+A #pypy irc channel provided communication among participants between sprints. The sourcecode 
+was loaded into subversion and participants given commit rights.
+
+At the Sprint:
+Some folks did lots of handwaving and created really kewl concepts. Other pairs concentrated on coding, testing, builtin functions etc etc. We gathered for goalsetting meetings several times during the sprint, then split up to work on tasks. Half of the work was done by pair programming. Pairs were informal, developing and changing as tasks were discovered and completed. Sprints varied in amount of "discuss as a group" and "just do it" time. We spent lots of intense time together, not just coding but also social time, (meals, spending a day playing tourist, etc), which enhanced the building of relationships and understanding among sprinters.
+
+Some discoveries: Plan on the first morning for hardware setup and fixing system issues (wireless is great!). Built-in private time is necessary for the sprint. Whiteboards and projectors are both necessary, as are coffee and tea. Bringing in/providing food is fine but getting people away for lunch is good to clear their minds. Leadership varied throughout the sprints and throughout the day.
+
+
+
+Brainstorming about what PyPy might be
+--------------------------------------
+
+The following was written down at the first Sprint to understand
+each other's motivations and ideas.  It's not very sorted but
+might still be interesting to skim. 
+
+- Python interpreter written in python
+    - loads bytecode
+    - delegates/dispatches to ObjectSpaces to implement operations
+      on the objects
+    - there can be more than one ObjectSpace
+        - for example: BorrowingObjectSpace (from CPython)
+    - define/implement a class that emulates the Python 
+      Execution Frame
+
+- use the main-loop of the interpreter to do a lot of
+  things (e.g. do type inference during running the bytecode 
+  or not even run the bytecodes, but interpret various attributes of the code) 
+
+- working together, producing something real
+
+- saving interpreter state to an image (following the smalltalk model)
+  process-migration / persistence
+
+- looking at the entire code base (living in an image), browsing 
+  objects interactively 
+
+- interactive environment, trying code snippets, introspection
+
+- deploying python made easy, integrate version control systems
+
+- integrate the various technologies on the web site, issue tracking,
+  Wiki...
+
+- separate exception handling from the mainline code, avoid peppering
+  your code with try :-), put exception handling into objects.
+
+- import python code from the version control store directly, give
+  imported code also a time dimension 
+
+- combining python interpreters from multiple machines (cluster) into a
+  virtual sandbox (agent space?) 
+
+- get a smaller (maybe faster) python with very little C-code
+
+- (hoping for Psyco) to render fast code from Python code (instead of
+  hard-c)
+
+- go to a higher level python core (and write out/generate interpreters
+  in different languages), e.g. the former P-to-C resolved the evalframe-loop
+  but still called into the Python-C-library which is statically coded
+
+- very far fetched: PyPython becomes a/the reference implementation
+ 
+- have enough flexibility to make a separate stackless obsolete
+
+- have a language that is high-level/easy enough to program
+  but with the same performance as statically compiled languages
+  (e.g. C++)
+
+
+what is the difference between a compiler and an interpreter
+------------------------------------------------------------
+
+f = bytecode interpreter
+p = program
+a = arguments
+
+c = compiler
+
+assert f(p, a) == c(p)(a) == r
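The equation above can be made runnable with a toy language. The sketch below is only an illustration under assumptions: the stack-machine instruction set and the sample `PROGRAM` are invented here, and `f` and `c` are the simplest interpreter and compiler that satisfy the relation.

```python
# A program is a list of instructions for a tiny stack machine.
# ("arg", n) pushes argument n, ("const", k) pushes k,
# ("add",) and ("mul",) pop two values and push the result.
PROGRAM = [("arg", 0), ("arg", 1), ("add",), ("const", 2), ("mul",)]


def f(p, a):
    """Interpreter: walk the program p each time it is run on arguments a."""
    stack = []
    for instr in p:
        op = instr[0]
        if op == "arg":
            stack.append(a[instr[1]])
        elif op == "const":
            stack.append(instr[1])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown instruction: %r" % (instr,))
    return stack.pop()


def c(p):
    """Compiler: translate p once into a Python function of a."""
    stack = []
    for instr in p:
        op = instr[0]
        if op == "arg":
            stack.append("a[%d]" % instr[1])
        elif op == "const":
            stack.append(repr(instr[1]))
        elif op in ("add", "mul"):
            right, left = stack.pop(), stack.pop()
            symbol = "+" if op == "add" else "*"
            stack.append("(%s %s %s)" % (left, symbol, right))
        else:
            raise ValueError("unknown instruction: %r" % (instr,))
    return eval("lambda a: " + stack.pop())


a = (3, 4)
r = 14  # (3 + 4) * 2
assert f(PROGRAM, a) == c(PROGRAM)(a) == r
```

The interpreter pays the cost of decoding instructions on every run, while the compiler pays it once and returns something that can be called repeatedly.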
+
+
+-  architecture overview
+    * byte code interp loop
+      plan what the interp loop should look like from a high level
+      map that structure in descriptions that can be used to generate interpreters/compilers
+      define the frame structure
+
+    * define an object/type model that maps into python data structures
+      wrap cpython objects into the new object model so we can continue
+      to use cpython modules
+    
+    * rewrite c python modules and the builtin object library in python
+      optimization for a potential global python optimizer, until that
+      exists it will be slower than the corresponding cpython implementation
+
+- import the cpython distribution so we can use parts of it in our
+  repository, make it easy to follow the cpython development
+
+- finish the python to byte code compiler in python project (this is
+  already part of the cpython distribution, needs a python lexer)
+
+- doing other things than interpreting byte code from the python interp
+  loop, for example generate C code, implement other object spaces (in our
+  terminology), other far fetched things with execution
+
+- how to enter C system calls into the python object space (ctypes)

File sprintinfo/LouvainLaNeuvePlan.txt

+Sprint Planning
+---------------
+
+Here is a list of things we might want to do at one of the next sprints.  
+Currently it's roughly what is left over from the last sprints. 
+
+- do more tests (eternal goal)
+
+- Fix XXX-marked things  (eternal goal)
+
+- enhance StdObjSpace, define goals and achieve them
+  http://codespeak.net/svn/pypy/trunk/src/goals/
+
+  - support the objects we see falling back to CPython.
+  - more builtins.
+  - more things from sys.
+  - dict object/type 
+    - Hash table based implementation of dictionaries?
+  - list object/type   
+  - write a small tool that checks a type's methods of
+    CPython against PyPy
+    (Jacob, Laura)  done
+
+- go through the wiki and clean up "stale" or old pages
+
+- implement the beginnings of a C code generator. the basic idea 
+  is "abstract interpretation"....
+
+- discuss funding and future organization issues
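The "small tool that checks a type's methods" item in the list above can be approximated with `dir()`. This is a sketch under assumptions and not the tool written at the sprint: `public_methods`, `missing_methods` and the stand-in class `MyList` are hypothetical.

```python
def public_methods(tp):
    """Names of callable attributes of tp that do not start with '_'."""
    return {name for name in dir(tp)
            if not name.startswith("_") and callable(getattr(tp, name))}


def missing_methods(reference_type, candidate_type):
    """Methods the reference type has but the candidate type lacks."""
    return sorted(public_methods(reference_type) - public_methods(candidate_type))


class MyList:
    """Hypothetical, deliberately incomplete reimplementation of list."""

    def __init__(self):
        self._items = []

    def append(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()
```

Calling `missing_methods(list, MyList)` then lists the names that still need an implementation. Special methods such as `__getitem__` are skipped by this sketch and would need a separate check.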
+
+---------------------------------------------------------------
+
+.. _boolobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000381.html
+.. _cpythonobj: http://codespeak.net/pipermail/pypy-svn/2003-June/000385.html
+.. _instmethod: http://codespeak.net/pipermail/pypy-svn/2003-June/000389.html
+.. _long: http://codespeak.net/pipermail/pypy-svn/2003-June/000410.html
+.. _sliceobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000408.html
+.. _userobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000449.html
+.. _dictobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000515.html
+.. _intobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000443.html
+.. _iterobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000529.html

File sprintinfo/LouvainLaNeuveReport.txt

+LouvainLaNeuveSprint summary report
+-----------------------------------
+
+- enhanced/reviewed standard object space 
+
+  - reviewed, improved and enriched object implementations
+    mostly done by Christian, Alex 
+    boolobject_, cpythonobject_, instmethodobject_, longobject_ (removed),
+    sliceobject_, userobject_, dictobject_, iterobject_.
+
+  - stringobject was completed with lots of tests 
+    (Tomek, Guenter)
+
+  - various improvements/bugfixes in a lot of objects/types
+
+  - implemented tool/methodChecker.py to examine which methods 
+    are existing on the types of the standard object space 
+    (Jacob)
+
+- implemented language features
+    - implemented nested scopes (Michael, Holger)
+      dissociated the frame.locals implementation from
+      the dict objectspace implementation (Guido, Holger)
+
+    - implemented generators (Michael, Holger)  in Std,Triv space
+
+    - implemented unbound methods and descriptors (Michael, Samuele)
+
+    - first cut at implementing the right __new__/__init__ semantics 
+      (Armin, Samuele)
+
+    - use interpreter-level Python class inheritance for structure
+      sharing for user subclasses of builtin types, switched to an
+      independent hierarchy for mm-dispatch purposes (dispatchclass attr)
+      (Samuele, design discussion with Armin)
+
+- implemented the beginnings of the AnnotationObjectSpace
+  (Armin, Guido, Michael) for abstract interpretation.
+
+- added lots of tests (all of us)
+
+- refactoring of argument-parsing for tool/test.py 
+  and introduction of the "py.py" tool that unifies 
+  executing commands, files and going interactive. 
+  (Michael)
+
+- refactoring, improvements of multimethods (Armin, Samuele)
+
+- documentation was restructured and transferred from
+  the wiki to subversion. The documents are now in reST-format.
+  Also improvements and crossreferences between the
+  documents. (Anna)
+  A trigger was implemented that generates the new html-pages after
+  each checkin in the pypy/trunk/doc directory. (Holger)
+
+- OSCON-2003 paper was being worked on and enhanced!
+  (Laura, Jacob, Guido, Armin, ...)
+
+- we had a discussion about EU-Funding. The EU currently
+  puts forward a Call for Proposals which apparently fits
+  the goals of pypy-development very much. There is interest 
+  among current developers to go in this direction.
+
+- bugfixes, refactorings and adding tests all over the place 
+  (everybody)
+
+---------------------------------------------------------------
+
+.. _boolobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000381.html
+.. _cpythonobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000385.html
+.. _instmethodobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000389.html
+.. _longobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000410.html
+.. _sliceobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000408.html
+.. _userobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000449.html
+.. _dictobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000515.html
+.. _intobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000443.html
+.. _instmethod: http://codespeak.net/pipermail/pypy-svn/2003-June/000389.html
+.. _iterobject: http://codespeak.net/pipermail/pypy-svn/2003-June/000529.html
+

File sprintinfo/VilniusSprintAnnouncement.txt

+Hi Pythonistas and interested developers,
+
+PyPy, the python-in-python implementation, is steadily moving
+on.  The next coding sprint will take place in Vilnius,
+Lithuania, from
+
+    15th to 23rd of November, 2004
+
+and is organized by the nice Programmers of Vilnius (POV)
+company.  See http://codespeak.net/pypy/index.cgi?doc for more
+in-depth information about PyPy.
+
+Again, we will be heading towards a first generated C version
+of our already pretty compliant Python interpreter and types
+implementation.  Last time, before the EuroPython 2004
+conference, we actually had a similar goal (a PyPy/C-version) but
+discovered we had to largely refactor the basic model for
+attribute accesses. We are now closely mirroring the marvelous
+"descriptor"-mechanism of CPython.   
+
+If you are interested in participating in our fun and somewhat
+mind-altering python sprint event then please subscribe at
+
+  http://codespeak.net/moin/pypy/moin.cgi/VilniusSprintAttendants
+
+and look around for more information. You'll find that most of
+the core PyPy developers are already determined to come. There
+are also many areas that need attention so that we should
+have tasks suited for different levels of expertise. 
+
+At http://codespeak.net/moin/pypy/moin.cgi/VilniusSprintTasks
+you'll find our sprint planning task list which will probably 
+grow in the next weeks.
+
+Note that our EU funding efforts are at the final stage now.
+In the next weeks, quite likely before the Vilnius sprint, we
+_hope_ to get a "go!" from the european commission.  One side
+effect would be that coders - probably restricted to european
+citizens - may generally apply for getting travel and
+accommodation costs refunded for PyPy sprints.  This would
+reduce the barrier of entry to the question of whether you would
+like to spend your time at a pypy sprint.  However, we probably need
+some time to work out the details once we get more
+information from the EU.
+
+If you have any questions don't hesitate to contact
+pypy-sprint@codespeak.net or one of us personally.
+
+cheers & a bientot,
+
+    Holger Krekel, Armin Rigo 

File sprintinfo/checklist.txt

+PyPy Sprint Checklist
+=====================
+
+This is a hands-on checklist for organizing a PyPy sprint. 
+Usually there is at least one person driving the sprint from
+the developer's side.  And one or more people preparing the
+event locally (organizing room + internet, providing some
+information about local oddities etc. :-)
+
+pre (usually done by a local organizer + a pypy-developer)
+----------------------------------------------------------
+
+- find someone who can organize locally a good place with (at least)
+  these characteristics:
+
+  - for at least 12 people
+
+  - with a permanent internet connection
+
+  - with enough chairs and tables
+
+  - with a nearby possibility to get food
+
+  - facilities to make coffee and tea
+
+  - a whiteboard or something to draw pictures on (in a shared session)
+
+  - a beamer, if possible (otherwise try to bring it externally)
+
+- discuss goals with developers and (pre-)announce sprint
+
+  - first clear the list of pypy-sprint subscribers
+
+  - pre-announce to pypy-dev, briefly describing the planned event and its
+    goals, in order to gather feedback on whether the planned event will get
+    enough "insiders" interested.
+
+  - one developer should (after appropriate discussion) make the announcement 
+    to c.l.py & python-dev plus any lists which might be interested for that 
+    specific event (like e.g. lispers :-). Don't forget to mention all 
+    the helpful local people who make this event happen!
+
+
+logistics (usually done by the local organizer)
+-----------------------------------------------
+
+- make sure there is a detailed description how to get to the sprint 
+  location by train, plane, car
+
+- try to help people to find accommodation (it's usually a lot easier for
+  a local to judge/recommend (or organize in the best case) accommodation
+  than for all the people from abroad)
+
+- if necessary recommend a week-ticket and some public transport plan
+  so that people know how to move in that city. 
+
+
+technical advice for sprinters
+------------------------------
+
+- subscribe to pypy-dev
+
+- everyone has to have an account on codespeak and should check *BEFORE* 
+  the sprint if everything works (including subversion-checkouts/checkins)
+
+- follow the discussion on pypy-sprint/pypy-dev at least the two weeks
+  before the sprint :-)
+
+at the sprint
+-------------
+
+- to be done :-)

File to_edu_sig.txt

+Happy New Year, Dethe!  Thank you for your interest.  You write:
+
+  >PyPy uses a similar approach to Pyrex, called Psyco, which compiles
+  >directly to 80x86 machine code (Pyrex compiles to cross-platform C
+  >code).  This allows PyPy to attempt to be faster than C-Python by
+  >creating compiler optimizations.  Not sure what the PyPy story is for
+  >non-x86 platforms.  There is also a project to recreate Python on top
+  >of Smalltalk, by L. Peter Deutsch, which he expects to be faster than
+  >C-Python (and if anyone can do it, he could).
+
+  >Nice to see y'all again.  Happy New Year (or Gnu Year?).
+
+  >--Dethe
+
+
+I'd like to clarify a few misunderstandings I think you have.  Psyco
+is not a technique, but rather a specialising compiler available as a
+Python extension module.  It compiles directly to 386 machine
+code. PyPy, on the other hand, currently emits C code.  Our previous
+version emitted Pyrex code.  Some people in Korea are making a version
+that emits Lisp code.  PyPy doesn't use Psyco, though many ideas are
+common to both.
+
+What's more there is nothing magic about machine code that makes it
+automatically fast -- an inefficiently-coded algorithm in assembler is
+still a turtle.  The win in using PyPy is not about 'saving the time
+it takes to have a conversation with your C compiler', but instead
+about making such conversations more productive and useful.  The more
+information you can tell your C compiler about your data, the better
+code it is prepared to generate.  This is the tradeoff between
+flexibility and speed.
+
+When your Python system sees x = a + b it has no clue as to what types
+a and b are.  They could be anything. This 'being ready to handle
+everything' has a large performance penalty.  The runtime system has
+to be prepared to do a _lot_, so it has to be fairly intelligent.
+All this intelligence is in the form of code instructions, and there
+are a lot of them that the runtime system has to execute, every time
+it wants to do anything at all.  On the other hand, at code _reading_
+time, the Python interpreter is purposefully stupid-but-straightforward.  
+It doesn't have much to do, and so can be relatively quick about not doing it.
+
+A statically-typed compiled language works in precisely the other way.
+When the runtime system of a statically typed language sees x = a + b,
+it already knows all about x, a and b and their types.  All the hard
+work was done in the compiling phase.  There is very little left to
+worry about -- you might get an overflow exception or something -- but
+as an oversimplification, all the runtime system of a statically typed
+language has to know how to do is how to load and run.  That's fast.
+
+So, one way you could speed up Python is to add type declarations,
+thus simplifying the life of the runtime system.  This proposed
+solution is more than a little drastic for those of us who like duck
+typing, signature based polymorphism, and the particular way coding in
+Python makes you think and feel.
+
+The PyPy alternative is to make the interpreter even smarter, and a
+whole lot better at remembering what it is doing. For instance, when
+it sees x = a + b instead of just adding this particular int to this
+particular int, it could generate some general purpose code for adding
+ints to ints.  This code could be thus used for all ints that you wish
+to add this way.  So while the first time is slow, the _next_ time
+will be a lot faster.  And that is where the performance speedup
+happens -- in code where the same lines get run again, and again, and
+again.  If all you have is a mainline where each line of code gets
+executed once, then we won't help you at all.
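The "slow the first time, fast the next time" idea can be sketched with a cache keyed on operand types. This is only an illustration under assumptions: `SpecializingAdder` is a hypothetical class, and it selects an existing routine where a real system would generate specialized code.

```python
class SpecializingAdder:
    """Hypothetical sketch: build one 'add' routine per pair of operand types."""

    def __init__(self):
        self.cache = {}        # (type, type) -> specialized function
        self.slow_paths = 0    # how often the slow, generic path ran

    def _specialize(self, type_a, type_b):
        # Stand-in for generating real code: pick the operation once so
        # later calls with the same types skip the type inspection.
        self.slow_paths += 1
        if type_a is int and type_b is int:
            return int.__add__
        if type_a is str and type_b is str:
            return str.__add__
        return lambda a, b: a + b    # generic fallback

    def add(self, a, b):
        key = (type(a), type(b))
        func = self.cache.get(key)
        if func is None:
            func = self.cache[key] = self._specialize(*key)
        return func(a, b)
```

Adding two ints a thousand times goes through `_specialize` only once, so the benefit grows with how often the same line runs, and a program whose lines each run once gains nothing.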
+
+Psyco is about as far as you can get with this approach and be left
+with a module you can import and use with regular python 2.2 or
+better.  See: http://psyco.sourceforge.net/ PyPy is what you get when
+you pitch the old interpreter and write your own.  See:
+http://codespeak.net/pypy/
+
+And we should probably move further discussion to pypy-dev, here:
+http://codespeak.net/mailman/listinfo/pypy-dev
+
+Thanks for your interest, and thanks for writing,
+Laura Creighton (for PyPy)
+