Commits

Anonymous committed fd0f614 Draft

Initial import of all my implementation-less language specs.

Files changed (8)

didigm/doc/didigm.html

+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<!-- encoding: UTF-8 -->
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+<title>The Didigm Reflective Cellular Automaton</title>
+  <!-- begin html doc dynamic markup -->
+  <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
+  <script type="text/javascript" src="/scripts/documentation.js"></script>
+  <!-- end html doc dynamic markup -->
+</head>
+<body>
+
+<h1>The Didigm Reflective Cellular Automaton</h1>
+
+<p>November 2007, Chris Pressey, Cat's Eye Technologies</p>
+
+<h2>Introduction</h2>
+
+<p>Didigm is a <em>reflective cellular automaton</em>.  What I mean to
+impart by this phrase is that it is a cellular automaton where the
+transition rules are given by the very patterns of cells that exist
+in the playfield at any given time.</p>
+
+<p>Perhaps another way to think of Didigm is:
+Didigm =
+<a href="/projects/alpaca/">ALPACA</a> +
+<a href="/projects/ypsilax/">Ypsilax</a>.</p>
+
+<h2>Didigm as Parameterized Language</h2>
+
+<p>Didigm is actually a parameterized language.  A parameterized
+language is a schema for specifying a set of languages, where a specific language
+can be obtained by supplying one or more parameters.  For example,
+<a href="/projects/xigxag/">Xigxag</a> is a parameterized language,
+where the direction of the scanning and the direction of the building
+of new states are parameters.  Didigm "takes" a single parameter, an integer.
+This parameter determines how many <em>colours</em> (number of possible
+states for any given cell) the cellular automaton has.  This parameter
+defaults to 8.  So when we say Didigm, we actually mean
+Didigm(8), but there are an infinite number of other possible
+languages, such as Didigm(5) and Didigm(70521).</p>
+
+<p>The languages Didigm(0), Didigm(-1), Didigm(-2)
+and so forth are probably nonsensical aberrations, but I'll leave that
+question for the philosophers to ponder.
+Didigm(1) is at least well-defined, but it's trivial.
+Didigm(2) and Didigm(3) are semantically problematic for
+more technically interesting reasons (Didigm(3) might be CA-universal,
+but Didigm(2) probably isn't.)
+Didigm(4) and above are easily shown to be CA-universal.</p>
+
+<p>
+(I say CA-universal, and not Turing-complete, because technically
+cellular automata cannot simulate Turing machines without some
+extra machinery: TMs can halt, but CAs can't.  Since I don't want
+to deal with defining that extra machinery in Didigm, it's simpler to
+avoid it for now.)
+</p>
+
+<p>Colours are typically numbered.  However, this is not meant to imply
+an ordering between colours.  The eight colours of Didigm
+are typically referred to as 0 to 7.</p>
+
+<h2>Language Description</h2>
+
+<h3>Playfield</h3>
+
+<p>The Didigm playfield, called <em>le monde</em>,
+is considered unbounded, like most cellular automaton
+playfields, but there is one significant difference.
+There is a horizontal division in this playfield, splitting
+it into regions called <em>le ciel</em>, on top,
+and <em>la terre</em>, below.  This division is
+distinguishable — meaning, it must be possible to tell which region
+a given cell is in — but it need not have a presence beyond that.
+Specifically, this division lies on the edges between cells, rather than in
+the cells themselves.  It has no "substance" and need not be visible
+to the user.  (The Didigm Input Format, below, describes how it
+may be specified in textual input files.)</p>
+
+<h3>Magic Colours</h3>
+
+<p>Each region of the division has a distinguished colour which is
+called the <em>magic colour</em> of that region.  The magic colour of le ciel is
+colour 0.  The magic colour of la terre is colour 7.
+(In Didigm(<var>n</var>),
+the magic colour of la terre is colour <var>n</var>-1.)</p>
+
+<h3>Transition Rules</h3>
+
+<h4>Definition</h4>
+
+<p>The transition rules of the cellular automaton are not fixed;
+rather, they are given by certain forms that are present in the playfield.</p>
+
+<p>Such a form is called <em>une salle</em> and has the following configuration.
+Two horizontally-adjacent cells of the magic colour abut a cell of the <em>destination
+colour</em> to the right.  Two cells below the rightmost magic-colour
+cell is the cell of the <em>source colour</em>; it is surrounded by cells of
+any colour called the <em>determiners</em>.</p>
+
+<p>This is perhaps better illustrated than explained.  In the following diagram,
+the magic colour is 0 (this salle is in le ciel),
+the source colour is 1, the destination colour is 2, and the determiners are
+indicated by D's.</p>
+
+<pre>
+002
+DDD
+D1D
+DDD
+</pre>
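+<p>The configuration above can be recognized mechanically.  The
+following Python sketch is purely illustrative: it is not part of the
+language definition, and the grid representation and function names
+are my own assumptions.</p>

```python
# Hypothetical salle scanner; a salle is recorded as a
# (source colour, determiner ring, destination colour) triple.

def find_salles(grid, magic):
    """Scan a rectangular grid (a list of lists of colour numbers)
    for salles whose magic colour is `magic`."""
    salles = []
    for r in range(len(grid) - 3):          # a salle spans rows r .. r+3
        for c in range(len(grid[r]) - 2):   # ... and columns c .. c+2
            # two horizontally-adjacent cells of the magic colour ...
            if grid[r][c] == magic and grid[r][c + 1] == magic:
                dest = grid[r][c + 2]       # ... abut the destination colour
                source = grid[r + 2][c + 1] # two below the rightmost magic cell
                # the eight determiners surround the source cell
                ring = tuple(grid[r + 1 + dr][c + dc]
                             for dr in range(3) for dc in range(3)
                             if (dr, dc) != (1, 1))
                salles.append((source, ring, dest))
    return salles
```

+<p>Applied to the diagram above (with the D's taken to be colour 3),
+this finds the single salle with source colour 1 and destination
+colour 2.</p>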
+
+<h4>Application</h4>
+
+<p>Salles are interpreted as transition rules as follows.  When the colour
+of a given cell is the same as the source colour of some salle, and when
+the colours of all the cells surrounding that cell are the exact same
+colours (in the exact same pattern) as the determiners of that salle,
+we say that that salle <em>matches</em> that cell.  When any cell is
+matched by some salle in the other region, we say that that salle
+<em>applies</em> to that cell, and that cell is replaced
+by a cell of the destination colour of that salle.</p>
+
+<p>"The other region" refers, of course, to the region that is not the
+region in which the cell being transformed resides.  Salles in la terre
+only apply to cells in le ciel and vice-versa.  This complementarity
+serves to limit the amount of chaos: if there were some salle that applied to
+<em>all</em> cells, it would apply directly to the cells that made up that
+salle, and that salle would be immediately transformed.</p>
+
+<p>On each "tick" of the cellular automaton, all cells are checked to
+find the salle that applies to them, and then all are transformed, simultaneously,
+resulting in the next configuration of le monde.</p>
+
+<p>There is a "default" transition rule which also serves to limit
+the amount of chaos: if no salle applies to a cell, the colour of that
+cell does not change.</p>
+
+<p>Salles may overlap.  However, no salle may straddle the horizon.
+(That is, each salle must be either completely in le ciel or completely
+in la terre.)</p>
+
+<p>Salles may conflict (i.e. two salles may have the same source colour
+and determiners, but different destination colours.)  The behaviour in
+this case is defined to be uniformly random: if there are <var>n</var>
+conflicting salles, each has a 1/<var>n</var> chance of being the
+one that applies.</p>
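+<p>A single tick might be sketched as follows: again hypothetical
+Python with my own names, where <code>rules</code> maps a (source
+colour, determiner ring) pair to the list of destination colours of
+all salles with that pattern, so that conflicts are resolved by a
+uniformly random choice.  For brevity this sketch scans only interior
+cells and ignores the restriction, described above, that a salle
+applies only to cells in the other region.</p>

```python
# Hypothetical one-tick evaluator: all cells are matched against the
# old grid and transformed simultaneously.
import random

def tick(grid, rules):
    new = [row[:] for row in grid]    # default rule: colour is unchanged
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[r]) - 1):
            ring = tuple(grid[r + dr][c + dc]
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                         if (dr, dc) != (0, 0))
            dests = rules.get((grid[r][c], ring))
            if dests:
                # conflicting salles: each destination is equally likely
                new[r][c] = random.choice(dests)
    return new
```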
+
+<h2>Didigm Input Format</h2>
+
+<p>I'd like to give some examples, but first I need a format to give them in.</p>
+
+<p>A Didigm Input File is a text file.  The textual digit symbols
+<code>0</code> through <code>9</code> indicate cells of colours 0 through 9.
+Further colours may be indicated by enclosing a decimal digit string in square brackets,
+for example <code>[123]</code>.  This digit string may contain leading
+zeros, in order for columns to line up nicely in the file.</p>
+
+<p>A line containing only a <code>,</code> symbol in the leftmost column
+indicates the division between le ciel and la terre.  This line does not
+become part of the playfield.</p>
+
+<p>A line beginning with a <code>=</code> is a directive of some sort.</p>
+
+<p>A line beginning with <code>=C</code> followed by a colour indicator
+indicates how many colours (the <var>n</var> in Didigm(<var>n</var>))
+this playfield contains.  This directive may only occur once.</p>
+
+<p>A line beginning with <code>=F</code> followed by a colour indicator
+as described above, indicates that the unspecified (and unbounded) remainder
+of le ciel or la terre (whichever side of <code>,</code> the directive is on)
+is to be considered filled with cells of the given colour.</p>
+
+<p>Of course, an application which implements Didigm with some
+alternate means of specifying le monde, for example a graphical user interface,
+need not understand the Didigm Input Format.</p>
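+<p>For illustration, a hypothetical Python reader for this format
+might look like the following; the function name and the returned
+representation are my own assumptions, not part of the format's
+definition.</p>

```python
# Hypothetical Didigm Input Format reader.
import re

def parse_playfield(text):
    """Return (ciel_rows, terre_rows, fills, n_colours).  Each row is a
    list of colour numbers; fills holds any =F background colour for
    le ciel (index 0) and la terre (index 1)."""
    regions = ([], [])            # le ciel, then la terre
    fills = [None, None]
    n_colours = 8                 # the parameter of Didigm(n) defaults to 8
    side = 0
    for line in text.splitlines():
        if line.rstrip() == ',':
            side = 1              # the divider is not part of le monde
        elif line.startswith('=C'):
            n_colours = int(line[2:].strip('[]'))
        elif line.startswith('=F'):
            fills[side] = int(line[2:].strip('[]'))
        elif line:
            # a cell is a single digit or a bracketed digit string
            cells = re.findall(r'\[\d+\]|\d', line)
            regions[side].append([int(c.strip('[]')) for c in cells])
    return regions[0], regions[1], fills, n_colours
```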
+
+<h2>Examples</h2>
+
+<p>Didigm is immediately seen to be CA-universal, in that you can
+readily (and stably) express a number of known CA-universal cellular automata in it.
+For example, to express John Conway's Life, you could say that
+colour 1 means "alive" and colour 2 means "dead", and compose something like</p>
+
+<pre>
+002002001001
+222122112212 <i>... and so on ...</i>
+212212212121 <i>... for all 256 ...</i>
+222222222222 <i>... rules of Life ...</i>
+=F3
+,
+=F2
+22222
+21222
+21212
+21122
+22222
+</pre>
+
+<p>Because the magic colour 7 never appears in la terre, there are no
+salles in la terre, and so le ciel always stays the same.</p>
+
+<p>There are of course simpler CA's that are apparently CA-universal that
+would be possible to describe more compactly.  But more interesting (to me)
+is the possibility for making reflective CA's.</p>
+
+<p>To do this in an uncontrolled fashion is easy.  We just stick some salles
+in le ciel, some salles in la terre, and let 'er rip.  Unfortunately, in general,
+les salles in each region will probably quickly damage enough of the salles in
+the other region that le monde will become sterile soon enough.</p>
+
+<p>A rudimentary example of something a little more orchestrated follows.</p>
+
+<pre>
+3333333333333
+3002300230073
+3111311132113
+3311321131573
+3111311131333
+3333333333333
+=F3
+,
+=F1
+111111111111111
+111111131111111
+111111111111574
+111111111111333
+311111111111023
+111111111111113
+</pre>
+
+
+<p>The intent of this is that the 3's in la terre initially grow
+streams of 2's to their right, due to the leftmost two salles in le ciel.
+However, when the top stream of 2's reaches the cell just above and to the
+left of the 5, the third salle in le ciel matches and turns the 5 into a
+7, forming une salle dans la terre.  This salle turns every 2 to the right of
+a 0 in le ciel into a 4, thus modifying two of les salles in le ciel.
+The result of these modified salles is to turn the bottom stream of 2's
+into a stream of 4's halfway along.</p>
+
+<p>This is at least predictable, but it still becomes uninteresting fairly quickly.
+Also note that it's not just the isolated 3's in la terre that grow
+streams of 2's to the right: the 3's on the right side of la salle
+would, too.  This could be rectified by having a wall of some other colour
+on that side of la salle, and I'm sure you could extend this example
+by having something else happen when the stream of 4's hits the 0 in that
+salle, but you get the picture.  Creating a neat and tidy and long-lived
+reflective cellular automaton requires at least as much care as
+constructing a "normal" cellular automaton, and probably in general more.</p>
+
+<h2>History</h2>
+
+<p>I came up with the concept of a reflective cellular automaton
+(which is, as far as I'm aware, a novel concept) independently on
+November 1st, 2007, while walking on Felix Avenue, in Windsor, Ontario.</p>
+
+<p>No reference implementation exists yet.  Therefore, all Didigm runs
+have been thought experiments, and it's entirely possible that I've missed
+something in its definition that having a working simulator would reveal.</p>
+
+<p>Happy magic colouring!
+<br/>Chris Pressey
+<br/>Chicago, Illinois
+<br/>November 17, 2007</p>
+
+</body>
+</html>

madison/doc/Madison.falderal

+Madison
+=======
+
+Version 0.1
+December 2011, Chris Pressey, Cat's Eye Technologies
+
+Abstract
+--------
+
+Madison is a language in which one can state proofs of properties
+of term-rewriting systems.  Classical methods of automated reasoning,
+such as resolution, are not used; indeed, term-rewriting itself is
+used to check the proofs.  Both direct proof and proof by induction
+are supported.  Induction in a proof must be across a structure which
+has a well-founded inductive definition.  Such structures can be
+thought of as types, although this is largely nominal; the traditional
+typelessness of term-rewriting systems is largely retained.
+
+Term-rewriting
+--------------
+
+Madison has at its core a simple term-rewriting language.  It is of
+a common form which should be unsurprising to anyone who has worked
+at all with term rewriting.  A typical simple program contains
+a set of rules and a term on which to apply those rules.  Each
+rule is a pair of terms; either term of the pair may contain
+variables, but any variable that appears on the r.h.s. must also
+appear on the l.h.s.  A rule matches a term if its l.h.s. is the
+same as the term except for the variables, which are bound to the
+corresponding subterms; applying a matching rule replaces the term
+with the r.h.s. of the rule, with the variables expanded
+appropriately.  Rules are applied in an innermost, leftmost fashion to
+the term, corresponding to eager evaluation.  Rewriting terminates
+when there is no rule whose l.h.s. matches the current incarnation
+of the term being rewritten.
+
+A term is either an atom, which is a symbol that stands alone,
+or a constructor, which is a symbol followed by a comma-separated list
+of subterms enclosed in parentheses.  Symbols may consist of letters,
+digits, and hyphens, with no intervening whitespace.  A symbol is
+a variable symbol if it begins with a capital letter.  Variable
+symbols may also begin with underscores, but these may only occur
+in the l.h.s. of a rewrite rule, to indicate that we don't care
+what value is bound to the variable and we won't be using it on
+the r.h.s.
+
+(The way we are using the term "constructor" may be slightly
+non-standard; in some other sources, this is called a "function symbol",
+and a "constructor" is a subtly different thing.)
+
+Because the rewriting language is merely a component (albeit the
+core component) of a larger system, the aforementioned typical
+simple program must be cast into some buttressing syntax.  A full
+program consists of a `let` block which contains the rules
+and a `rewrite` admonition which specifies the term to be
+rewritten.  An example follows.
+
+| let
+|   leftmost(tree(X,Y)) -> leftmost(X)
+|   leftmost(leaf(X))   -> X
+| in
+|   rewrite leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan)))
+= alice
+
+In the above example, there are two rules for the constructor
+`leftmost/1`.  The first is applied to the outer tree to obtain
+a new leftmost constructor containing the inner tree; the first
+is applied again to obtain a new leftmost constructor containing
+the leaf containing `alice`; and the second is applied to that
+leaf term to obtain just `alice`.  At that point, no more rules
+apply, so rewriting terminates, yielding `alice`.
+
+Madison is deterministic; if rules overlap, the first one given
+(syntactically) is used.  For this reason, it is a good idea
+to order rules from most specific to least specific.
+
+I used the phrase "typical simple program" above because I was
+trying intentionally to avoid saying "simplest program".  In fact,
+technically no `let` block is required, so you can write some
+really trivial Madison programs, like the following:
+
+| rewrite cat
+= cat
+
+I think that just about covers the core term-rewriting language.
+Term-rewriting is Turing-complete, so Madison is too.  If you
+wish to learn more about term rewriting, there are several good
+books and webpages on the subject; I won't go into it further
+here.
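+The core semantics just described can be sketched in Python.  This is
+only an illustration under a term representation of my own choosing
+(an atom is a string, a constructor is a tuple of its name followed by
+its subterms, and a variable is any symbol starting with a capital
+letter or underscore); it is not Madison's definition.

```python
# A minimal sketch of innermost, leftmost term rewriting.

def is_var(t):
    return isinstance(t, str) and (t[0].isupper() or t[0] == '_')

def match(pattern, term, env):
    """Bind the pattern's variables to subterms of `term`, or return None."""
    if is_var(pattern):
        env = dict(env)
        env[pattern] = term
        return env
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and len(pattern) == len(term) and pattern[0] == term[0]:
        for p, t in zip(pattern[1:], term[1:]):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

def substitute(term, env):
    if is_var(term):
        return env[term]
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(t, env) for t in term[1:])
    return term

def rewrite(term, rules):
    """Reduce subterms first (eager evaluation), then try the rules in
    order -- the first matching rule wins -- until none applies."""
    while True:
        if isinstance(term, tuple):
            term = (term[0],) + tuple(rewrite(t, rules) for t in term[1:])
        for lhs, rhs in rules:
            env = match(lhs, term, {})
            if env is not None:
                term = substitute(rhs, env)
                break
        else:
            return term
```

+Under the two `leftmost` rules of the example above, this sketch
+reduces the example term to `alice`.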
+
+Proof-Checking
+--------------
+
+My desire with Madison was to design a language in which you
+can prove things.  Not a full-blown theorem prover -- just a
+proof checker, where you supply a proof and it confirms either
+that the proof holds or doesn't hold.  (Every theorem prover
+has at its core a proof checker, but it comes bundled with a lot of
+extra machinery to search the space of possible proofs cleverly,
+looking for one which will pass the proof-checking phase.)
+
+It's no coincidence that Madison is built on top of a term-rewriting
+language.  For starters, a proof is very similar to the execution
+trace of a term being rewritten.  In each of the steps of the proof,
+the statement to be proved is transformed by replacing some part
+of it with some equally true thing -- in other words, rewritten.
+In fact, Post Canonical Systems were an early kind of rewriting
+system, devised by Emil Post to (as I understand it) illustrate this
+similarity, and to show that proofs could be mechanically carried out
+in a rewriting system.
+
+So: given a term-rewriting language, we can give a trivial kind
+of proof simply by stating the rewrite steps that *should* occur
+when a term is rewritten, and check that proof by rewriting the term
+and confirming that those were in fact the steps that occurred.
+
+For the purpose of stating these sequences of rewrite steps to be
+checked, Madison has a `theorem..proof..qed` form.  To demonstrate
+this form, let's use Madison to prove that 2 + 2 = 4, using Peano
+arithmetic.
+
+| let
+|   add(s(X),Y) -> add(X,s(Y))
+|   add(z,Y)    -> Y
+| in theorem
+|   add(s(s(z)),s(s(z))) ~> s(s(s(s(z))))
+| proof
+|   add(s(s(z)),s(s(z)))
+|   -> add(s(z),s(s(s(z)))) [by add.1]
+|   -> add(z,s(s(s(s(z))))) [by add.1]
+|   -> s(s(s(s(z))))        [by add.2]
+| qed
+= true
+
+The basic syntax should be fairly apparent.  The `theorem` block
+contains the statement to be proved.  The `~>` means "rewrites
+in zero or more steps to".  So, here, we are saying that 2 + 2
+(in Peano notation) rewrites, in zero or more steps, to 4.
+
+The `proof` block contains the actual series of rewrite steps that
+should be carried out.  For elucidation, each step may name the
+particular rule which is applied to arrive at the transformed term
+at that step.  Rules are named by their outermost constructor,
+followed by a dot and the ordinal position of the rule in the list
+of rules.  These rule-references are optional, but the fact that
+the rule so named was actually used to rewrite the term at that step
+could be checked too, of course.  The `qed` keyword ends the proof
+block.
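+As a toy illustration of the checking idea (this is not Madison
+itself), the 2 + 2 proof above can be verified in Python with the two
+`add` rules hard-coded; the term representation (tuples and strings)
+and helper names are my own assumptions.

```python
# Check a sequence of claimed rewrite steps against named rules.

def apply_named(term, name):
    """Apply the named rule once at the root of `term`, or return None."""
    # add.1: add(s(X),Y) -> add(X,s(Y))
    if name == 'add.1' and term[0] == 'add' and term[1][0] == 's':
        return ('add', term[1][1], ('s', term[2]))
    # add.2: add(z,Y) -> Y
    if name == 'add.2' and term[0] == 'add' and term[1] == 'z':
        return term[2]
    return None

def check_proof(first, steps):
    """`first` is the initial term; `steps` is a list of
    (rule-name, claimed-term) pairs, as in a proof block."""
    cur = first
    for i, (name, claimed) in enumerate(steps, start=1):
        if apply_named(cur, name) != claimed:
            return 'Error in proof: step %d does not follow ' \
                   'from applying [%s]' % (i, name)
        cur = claimed
    return True
```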
+
+Naturally, you can also write a proof which does not hold, and
+Madison should inform you of this fact.  2 + 3, for example,
+does not equal 4, and it can pinpoint exactly where you went
+wrong should you come to this conclusion:
+
+| let
+|   add(s(X),Y) -> add(X,s(Y))
+|   add(z,Y)    -> Y
+| in theorem
+|   add(s(s(z)),s(s(s(z)))) ~> s(s(s(s(z))))
+| proof
+|   add(s(s(z)),s(s(s(z))))
+|   -> add(s(z),s(s(s(s(z))))) [by add.1]
+|   -> add(z,s(s(s(s(z)))))    [by add.1]
+|   -> s(s(s(s(z))))           [by add.2]
+| qed
+? Error in proof [line 6]: step 2 does not follow from applying [add.1] to previous step
+
+Now, while these *are* proofs, they don't tell us much about the
+properties of the terms and rules involved, because they are not
+*generalized*.  They say something about a few fixed values, like
+2 and 4, but they do not say anything about any *infinite*
+sets of values, like the natural numbers.  Now, that would be *really*
+useful.  And, while I could say that what you've seen of Madison so far
+is a proof checker, it is not a very impressive one.  So let's take
+this further.
+
+Quantification
+--------------
+
+To state a generalized proof, we will need to introduce variables,
+and to have variables, we will need to be able to say what those
+variables can range over; in short, we need *quantification*.  Since
+we're particularly interested in making statements about infinite
+sets of values (like the natural numbers), we specifically want
+*universal quantification*:
+
+    For all x, ...
+
+But to have universal quantification, we first need a *universe*
+over which to quantify.  When we say "for all /x/", we generally
+don't mean "any and all things of any kind which we could
+possibly name /x/".  Rather, we think of /x/ as having a type of
+some kind:
+
+    For all natural numbers x, ...
+
+Then, if our proof holds, it holds for all natural numbers.
+No matter what integer value greater than or equal to zero
+we choose for /x/, the truism contained in the proof remains true.
+This is the sort of thing we want in Madison.
+
+Well, to start, there is one glaringly obvious type in any
+term-rewriting language, namely, the term.  We could say
+
+    For all terms t, ...
+
+But it would not actually be very interesting, because terms
+are so general and basic that there's not actually very much you
+can say about them that you don't already know.  You sort of need
+to know the basic properties of terms just to build a term-rewriting
+language (like the one at Madison's core) in the first place.
+
+The most useful property of terms as far as Madison is concerned is
+that the subterm relationship is _well-founded_.  In other words,
+in the term `c(X)`, `X` is "smaller than" `c(X)`, and since terms are
+finite, any series of rewrites which always results in "smaller" terms
+will eventually terminate.  For completeness, we should probably prove
+that rigorously, but for expediency we will simply take it as a given
+fact for our proofs.
+
+Anyway, to get at something actually interesting, we must look further
+than the terms themselves.
+
+Types
+-----
+
+What's actually interesting is when you define a restricted
+set of forms that terms can take, and you distinguish terms inside
+this set of forms from the terms outside the set.  For example,
+
+| let
+|   boolean(true)  -> true
+|   boolean(false) -> true
+|   boolean(_)     -> false
+| in
+|   rewrite boolean(false)
+= true
+
+We call a set of forms like this a _type_.  As you can see, we
+have basically written a predicate that defines our type.  If any
+of the rewrite rules in the definition of this predicate rewrite
+a given term to `true`, that term is of our type; if it rewrites
+to `false`, it is not.
+
+Once we have types, any constructor may be said to have a type.
+By this we mean that no matter what subterms the constructor has,
+the predicate of the type of which we speak will always reduce to
+`true` when that term is inserted in it.
+
+Note that using predicates like this allows our types to be
+non-disjoint; the same term may reduce to true in two different
+predicates.  My first sketches for Madison had disjoint types,
+described by rules which reduced each term to an atom which named
+the type of that term.  (So the above would have been written with
+rules `true -> boolean` and `false -> boolean` instead.)  However,
+while that method may be, on the surface, more elegant, I believe
+this approach better reflects how types are actually used in
+programming.  At the end of the day, every type is just a predicate,
+and there is nothing stopping 2 from being both a natural number and
+an integer.  And, for that matter, a rational number and a real
+number.
+
+In theory, every predicate is a type, too, but that's where things
+get interesting.  Is 2 not also an even number, and a prime number?
+And in an appropriate (albeit contrived) language, is it not a
+description of a computation which may or may not always halt?
+
+The Type Syntax
+---------------
+
+The above considerations motivate us to be careful when dealing
+with types.  We should establish some ground rules so that we
+know that our types are useful to universally quantify over.
+
+Unfortunately, this introduces something of a chicken-and-egg
+situation, as our ground rules will be using logical connectives,
+while at the same time they will be applied to those logical
+connectives to ensure that they are sound.  This is not, actually,
+a big deal; I mention it here more because it is interesting.
+
+So, the rules which define our type must conform to certain
+rules, themselves.  While it would be possible to allow the
+Madison programmer to use any old bunch of rewrite rules as a
+type, and to check that these rules make for a "good" type when
+such a usage is seen -- and while this would be somewhat
+attractive from the standpoint of proving properties of
+term-rewriting systems using term-rewriting systems -- it's not strictly
+necessary to use a descriptive approach such as this, and there are
+certain organizational benefits we can achieve by taking a more
+prescriptive tack.
+
+Viz., we introduce a special syntax for defining a type with a
+set of rules which function collectively as a type predicate.
+Again, it's not strictly necessary to do this, but it does
+help organize our code and perhaps our thoughts, and perhaps make
+an implementation easier to build.  It's nice to be able to say,
+yes, what it means to be a `boolean` is defined right here and
+nowhere else.
+
+So, to define a type, we write our type rules in a `type..in`
+block, like the following.
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in
+|   rewrite boolean(false)
+= true
+
+As you can see, the wildcard reduction to false can be omitted for
+brevity.  (In other words, "Nothing else is a boolean" is implied.)
+And, the `boolean` constructor can be used for rewriting in a term
+just like any other plain, non-`type`-blessed rewrite rule.
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in
+|   rewrite boolean(tree(leaf(sabrina),leaf(joe)))
+= false
+
+Here are the rules that the type-defining rules must conform to.
+If any of these rules are violated in the `type` block, the Madison
+implementation must complain, and not proceed to try to prove anything
+from them.
+
+Once a type is defined, it cannot be defined further in a regular,
+non-type-defining rewriting rule.
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in let
+|   boolean(red) -> green
+| in
+|   rewrite boolean(red)
+? Constructor "boolean" used in rule but already defined as a type
+
+The constructor in the l.h.s. must be the same in all rules.
+
+| type foo is
+|   foo(bar) -> true
+|   baz(bar) -> true
+| in
+|   rewrite cat
+? In type "foo", constructor "baz" used on l.h.s. of rule
+
+The constructor used in the rules must be arity 1 (i.e. have exactly
+one subterm.)
+
+| type foo is
+|   foo(bar,X) -> true
+| in
+|   rewrite cat
+? In type "foo", constructor has arity greater than one
+
+It is considered an error if the predicate rules ever rewrite, inside
+the `type` block, to anything besides the atoms `true` or `false`.
+
+| type foo is
+|   foo(bar) -> true
+|   foo(tree(X)) -> bar
+| in
+|   rewrite cat
+? In type "foo", rule reduces to "bar" instead of true or false
+
+The r.h.s.'s of the rules of the type predicate must *always*
+rewrite to `true` or `false`.  That means, if we can't prove that
+the rules always rewrite to something, we can't use them as type
+predicate rules.  In practice, there are a few properties that
+we insist that they have.
+
+They may involve type predicates that have previously been
+established.
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in type boolbox is
+|   boolbox(box(X)) -> boolean(X)
+| in
+|   rewrite boolbox(box(true))
+= true
+
+They may involve certain, pre-defined rewriting rules which can
+be thought of as operators on values of boolean type (which, honestly,
+is probably built-in to the language.)  For now there is only one
+such pre-defined rewriting rule: `and(X,Y)`, where `X` and `Y` are
+booleans, and which rewrites to a boolean, using the standard truth
+table rules for boolean conjunction.
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in type boolpair is
+|   boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y))
+| in
+|   rewrite boolpair(pair(true,false))
+= true
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in type boolpair is
+|   boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y))
+| in
+|   rewrite boolpair(pair(true,cheese))
+= false
+
+Lastly, the r.h.s. of a type predicate rule can refer to the self-same
+type being defined, but *only* under certain conditions.  Namely,
+the rewriting must "shrink" the term being rewritten.  This is what
+lets us inductively define types.
+
+| type nat is
+|   nat(z)    -> true
+|   nat(s(X)) -> nat(X)
+| in
+|   rewrite nat(s(s(z)))
+= true
+
+| type nat is
+|   nat(z)    -> true
+|   nat(s(X)) -> nat(s(X))
+| in
+|   rewrite nat(s(s(z)))
+? Type not well-founded: recursive rewrite does not decrease in [nat.2]
+
+| type nat is
+|   nat(z)    -> true
+|   nat(s(X)) -> nat(s(s(X)))
+| in
+|   rewrite nat(s(s(z)))
+? Type not well-founded: recursive rewrite does not decrease in [nat.2]
+
+| type bad is
+|   bad(leaf(X)) -> true
+|   bad(tree(X,Y)) -> and(bad(X),bad(tree(Y,Y)))
+| in
+|   rewrite whatever
+? Type not well-founded: recursive rewrite does not decrease in [bad.2]
+
+We can check this by looking at all the rewrite rules in the
+definition of the type that are recursive, i.e. that contain on
+their r.h.s. the constructor being defined as a type predicate.
+For every such occurrence on the r.h.s. of a recursive rewrite,
+the contents of the constructor must be "smaller" than the contents
+of the constructor on the l.h.s.  What it means to be smaller
+should be fairly obvious: it just has fewer subterms.  If all the
+rules conform to this pattern, rewriting will eventually terminate,
+because it will run out of subterms to rewrite.
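+This check itself is mechanical.  The following Python sketch is my
+own illustration (terms as nested tuples of a constructor name and its
+subterms, variables as capitalized strings; these are assumptions, not
+Madison's syntax), and it uses the slightly stricter test that each
+recursive occurrence must wrap a proper subterm of the l.h.s. argument,
+which guarantees the "smaller" (fewer subterms) condition.

```python
# Sketch of the well-foundedness check for a type predicate.

def proper_subterms(term):
    """All subterms of `term` strictly below its root."""
    if not isinstance(term, tuple):
        return []
    out = []
    for sub in term[1:]:
        out.append(sub)
        out.extend(proper_subterms(sub))
    return out

def occurrences(term, name):
    """All subterms of `term` (including itself) headed by `name`."""
    if not isinstance(term, tuple):
        return []
    found = [term] if term[0] == name else []
    for sub in term[1:]:
        found.extend(occurrences(sub, name))
    return found

def well_founded(name, rules):
    """Every recursive occurrence of the type predicate on an r.h.s.
    must wrap a proper subterm of the argument on the l.h.s."""
    for lhs, rhs in rules:
        arg = lhs[1]                    # the predicate has arity 1
        for occ in occurrences(rhs, name):
            if occ[1] not in proper_subterms(arg):
                return False
    return True
```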
+
+Application of Types in Proofs
+------------------------------
+
+Now, aside from these restrictions, type predicates are basically
+rewrite rules, just like any other.  The main difference is that
+we know they are well-defined enough to be used to scope the
+universal quantification in a proof.
+
+Simply having a definition for a `boolean` type allows us to construct
+a simple proof with variables.  Universal quantification over the
+universe of booleans isn't exactly impressive; we don't cover an infinite
+range of values, like we would with integers, or lists.  But it's
+a starting point on which we can build.  We will give some rewrite rules
+for a constructor `not`, and prove that this constructor always reduces
+to a boolean when given a boolean.
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in let
+|   not(true)  -> false
+|   not(false) -> true
+|   not(_)     -> undefined
+| in theorem
+|   forall X where boolean(X)
+|     boolean(not(X)) ~> true
+| proof
+|   case X = true
+|     boolean(not(true))
+|     -> boolean(false)  [by not.1]
+|     -> true            [by boolean.2]
+|   case X = false
+|     boolean(not(false))
+|     -> boolean(true)   [by not.2]
+|     -> true            [by boolean.1]
+| qed
+= true
+
+As you can see, proofs using universally quantified variables
+need to make use of _cases_.  We know this proof is sound, because
+it shows the rewrite steps for all the possible values of the
+variable -- and we know they are all the possible values, from the
+definition of the type.
+
+In this instance, the cases are just the two possible values
+of the boolean type, but if the type was defined inductively,
+they would need to cover the base and inductive cases.  In both
+matters, each case in a complete proof maps to exactly one of
+the possible rewrite rules for the type predicate (and vice versa).
+
+Let's prove the type of a slightly more complex rewrite rule,
+one which has multiple subterms which can vary.  (This `and`
+constructor has already been introduced, and we've claimed we
+can use it in the definition of well-founded inductive types;
+but this code proves that it is indeed well-founded, and it
+doesn't rely on it already being defined.)
+
+| type boolean is
+|   boolean(true)  -> true
+|   boolean(false) -> true
+| in let
+|   and(true,true) -> true
+|   and(_,_)       -> false
+| in theorem
+|   forall X where boolean(X)
+|     forall Y where boolean(Y)
+|       boolean(and(X,Y)) ~> true
+| proof
+|   case X = true
+|     case Y = true
+|       boolean(and(true,true))
+|       -> boolean(true)        [by and.1]
+|       -> true                 [by boolean.1]
+|     case Y = false
+|       boolean(and(true,false))
+|       -> boolean(false)       [by and.2]
+|       -> true                 [by boolean.2]
+|   case X = false
+|     case Y = true
+|       boolean(and(false,true))
+|       -> boolean(false)       [by and.2]
+|       -> true                 [by boolean.2]
+|     case Y = false
+|       boolean(and(false,false))
+|       -> boolean(false)       [by and.2]
+|       -> true                 [by boolean.2]
+| qed
+= true
+
+Unwieldy, you say!  And you are correct.  But making something
+easy to use was never my goal.
+
+Note that the definition of `and()` is a bit more open-ended than
+`not()`.  `and.2` allows terms like `and(dog,cat)` to rewrite to `false`.
+But our proof only shows that the result of reducing `and(A,B)` is
+a boolean *when both A and B are booleans*.  So it tells us
+nothing about the type of `and(dog,cat)`, nor indeed anything
+at all about the properties of `and(A,B)` when one or both of `A` and
+`B` are not of boolean type.  So be it.
+
+Anyway, since we were speaking of inductively defined types
+previously, let's define one now.  With the help of `and()`, here is
+a type for binary trees.
+
+| type tree is
+|   tree(leaf)        -> true
+|   tree(branch(X,Y)) -> and(tree(X),tree(Y))
+| in
+|   rewrite tree(branch(leaf,leaf))
+= true
+
+We can define some rewrite rules on trees.  To start small,
+let's define a simple predicate on trees.
+
+| type tree is
+|   tree(leaf)        -> true
+|   tree(branch(X,Y)) -> and(tree(X),tree(Y))
+| in let
+|   empty(leaf)        -> true
+|   empty(branch(_,_)) -> false
+| in empty(branch(branch(leaf,leaf),leaf))
+= false
+
+| type tree is
+|   tree(leaf)        -> true
+|   tree(branch(X,Y)) -> and(tree(X),tree(Y))
+| in let
+|   empty(leaf)        -> true
+|   empty(branch(_,_)) -> false
+| in empty(leaf)
+= true
+
+Now let's prove that our predicate always rewrites to a boolean
+(i.e. that it has boolean type) when its argument is a tree.
+
+| type tree is
+|   tree(leaf)        -> true
+|   tree(branch(X,Y)) -> and(tree(X),tree(Y))
+| in let
+|   empty(leaf)        -> true
+|   empty(branch(_,_)) -> false
+| in theorem
+|   forall X where tree(X)
+|     boolean(empty(X)) ~> true
+| proof
+|   case X = leaf
+|     boolean(empty(leaf))
+|     -> boolean(true)  [by empty.1]
+|     -> true           [by boolean.1]
+|   case X = branch(S,T)
+|     boolean(empty(branch(S,T)))
+|     -> boolean(false) [by empty.2]
+|     -> true           [by boolean.2]
+| qed
+= true
+
+This isn't really a proof by induction yet, but it's getting closer.
+This is still really us examining the cases to determine the type.
+But, we have an extra guarantee here; in `case X = branch(S,T)`, we
+know `tree(S) -> true`, and `tree(T) -> true`, because `tree(X) -> true`.
+This is one more reason why `and(X,Y)` is built into Madison:
+Madison needs to know what it means in order to make use of this
+information in a proof.  We don't really use that extra information
+in this proof, but we will later on.
+
+Structural Induction
+--------------------
+
+Let's try something stronger, and get into something that could be
+described as real structural induction.  This time, we won't just prove
+something's type.  We'll prove something that actually walks and talks
+like a real (albeit simple) theorem: the reflection of the reflection
+of any binary tree is the same as the original tree.
+
+| type tree is
+|   tree(leaf)        -> true
+|   tree(branch(X,Y)) -> and(tree(X),tree(Y))
+| in let
+|   reflect(leaf)        -> leaf
+|   reflect(branch(A,B)) -> branch(reflect(B),reflect(A))
+| in theorem
+|   forall X where tree(X)
+|     reflect(reflect(X)) ~> X
+| proof
+|   case X = leaf
+|     reflect(reflect(leaf))
+|     -> reflect(leaf)        [by reflect.1]
+|     -> leaf                 [by reflect.1]
+|   case X = branch(S, T)
+|     reflect(reflect(branch(S, T)))
+|     -> reflect(branch(reflect(T),reflect(S)))          [by reflect.2]
+|     -> branch(reflect(reflect(S)),reflect(reflect(T))) [by reflect.2]
+|     -> branch(S,reflect(reflect(T)))                   [by IH]
+|     -> branch(S,T)                                     [by IH]
+| qed
+= true
+
+Finally, this is a proof using induction!  In the [by IH] clauses,
+IH stands for "inductive hypothesis", the hypothesis that we may
+assume in making the proof; namely, that the property holds for
+"smaller" instances of the type of X -- in this case, the "smaller"
+trees S and T that are used to construct the tree `branch(S, T)`.
+
+Relying on the IH is valid only after we have proved the base case.
+After having proved `reflect(reflect(S)) -> S` for the base cases of
+the type of S, we are free to assume that `reflect(reflect(S)) -> S`
+in the induction cases.  And we do so, to rewrite the last two steps.
+
+Like cases, the induction in a proof maps directly to the
+induction in the definition of the type of the variable being
+universally quantified upon.  If the induction in the type is
+well-founded, so too will be the induction in the proof.  (Indeed, the
+relationship between induction and cases is implicit in the
+concepts of the "base case" and "inductive case (or step)".)
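Although Madison checks proofs rather than running programs, the rewrite rules above can be transliterated into Python to spot-check the theorem on concrete trees.  This is a sketch under our own term encoding (atoms as strings, constructors as tuples); it is not part of the language:

```python
def reflect(t):
    if t == "leaf":
        return "leaf"                              # reflect.1
    _, a, b = t
    return ("branch", reflect(b), reflect(a))      # reflect.2

def is_tree(t):
    if t == "leaf":
        return True                                # tree.1
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "branch":
        return is_tree(t[1]) and is_tree(t[2])     # tree.2, via and()
    return False

t = ("branch", ("branch", "leaf", "leaf"), "leaf")
assert is_tree(t)
assert reflect(reflect(t)) == t                    # the theorem, on one instance
```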
+
+Stepping Back
+-------------
+
+So, we have given a simple term-rewriting-based language for proofs,
+and shown that it can handle a proof of a property over an infinite
+universe of things (trees.)  That was basically my goal in designing
+this language.  Now let's step back and consider some of the
+implications of this system.
+
+We have, here, a typed programming language.  We can define types
+that look an awful lot like algebraic data types.  But instead of
+glibly declaring the type of any given term, like we would in most
+functional languages, we actually have to *prove* that our terms
+always rewrite to a value of that type.  That's more work, of
+course, but it's also stronger: in proving that the term always
+rewrites to a value of the type, we have, naturally, proved that
+it *always* rewrites -- that its rewrite sequence is terminating.
+There is no possibility that its rewrite sequence will enter an
+infinite loop.  Often, we establish this with the help of the previously
+established fact that our inductively-defined types are well-founded,
+which is itself proved on the basis that the subterm relationship is
+well-founded.
+
+Much like we can prove termination in the course of proving a type,
+we can prove a type in the course of proving a property -- such
+as the type of `reflect(reflect(T))` above.  (This does not directly
+lead to a proof of the type of `reflect`, but whatever.)
+
+And, of course, we are only proving the type of a term on the
+assumption that its subterms have specific types.  These proofs
+say nothing about the other cases.  This may provide flexibility
+for extending rewrite systems -- or it might not, I'm not sure.
+It might be nice to prove that all other types result in some
+error term.  (One of the more annoying things about term-rewriting
+languages is how an error can result in a half-rewritten program
+instead of a recognizable error code.  There seems to be a tradeoff
+between extensibility and producing recognizable errors.)
+
+Grammar so Far
+--------------
+
+I think I've described everything I want in the language above, so
+the grammar should, modulo tweaks, look something like this:
+
+    Madison      ::= Block.
+    Block        ::= LetBlock | TypeBlock | ProofBlock | RewriteBlock.
+    LetBlock     ::= "let" {Rule} "in" Block.
+    TypeBlock    ::= "type" Symbol "is" {Rule} "in" Block.
+    RewriteBlock ::= "rewrite" Term.
+    Rule         ::= Term "->" Term.
+    Term         ::= Atom | Constructor | Variable.
+    Atom         ::= Symbol.
+    Constructor  ::= Symbol "(" Term {"," Term} ")".
+    ProofBlock   ::= "theorem" Statement "proof" Proof "qed".
+    Statement    ::= Quantifier Statement | MultiStep.
+    Quantifier   ::= "forall" Variable "where" Term.
+    MultiStep    ::= Term "~>" Term.
+    Proof        ::= Case Proof {Case Proof} | Trace.
+    Case         ::= "case" Variable "=" Term.
+    Trace        ::= Term {"->" Term [RuleRef]}.
+    RuleRef      ::= "[" "by" (Symbol "." Integer | "IH") "]".
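As a rough illustration of this grammar, here is a Python sketch of a recursive-descent parser for the `Term` production.  The lexical assumptions (identifiers are words; variables begin with an uppercase letter, as in the examples throughout this document) are ours, since the grammar leaves `Symbol` and `Variable` unspecified:

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[(),])")

def tokenize(s):
    toks, pos = [], 0
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if not m:
            raise SyntaxError(f"bad input at {s[pos:]!r}")
        toks.append(m.group(1))
        pos = m.end()
    return toks

def parse_term(toks, i=0):
    """Term ::= Atom | Constructor | Variable.  Returns (ast, next_index)."""
    name = toks[i]
    if i + 1 < len(toks) and toks[i + 1] == "(":
        args, i = [], i + 2
        while True:
            arg, i = parse_term(toks, i)
            args.append(arg)
            if toks[i] == ")":
                return ("ctor", name, args), i + 1
            assert toks[i] == ","
            i += 1
    kind = "var" if name[0].isupper() else "atom"
    return (kind, name), i + 1

ast, _ = parse_term(tokenize("and(tree(X),tree(Y))"))
assert ast == ("ctor", "and", [("ctor", "tree", [("var", "X")]),
                               ("ctor", "tree", [("var", "Y")])])
```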
+
+Discussion
+----------
+
+I think that basically covers it.  This document is still a little
+rough, but that's what major version zeroes are for, right?
+
+I have essentially convinced myself that the above-described system
+is sufficient for simple proof checking.  There are three significant
+things I had to convince myself of to get to this point, which I'll
+describe here.
+
+One is that types have to be well-founded in order for them to serve
+as scopes for universal quantification.  This is obvious in
+retrospect, but getting them into the language in a way where it was
+clear they could be checked for well-foundedness took a little
+effort.  The demarcation of type-predicate rewrite rules was a big
+step, and a little disappointing because it introduces the loaded
+term `type` into Madison's vernacular, which I wanted to avoid.
+But it made it much easier to think about, and to formulate the
+rules for checking that a type is well-founded.  As I mentioned, it
+could go away -- Madison could just as easily check that any
+constructor used to scope a universal quantification is well-founded.
+But that would probably muddy the presentation of the idea in this
+document somewhat.  It would be something to keep in mind for a
+subsequent version of Madison that further distances itself from the
+notion of "types".
+
+Also, it would probably be possible to extend the notion of
+well-founded rewriting rules to mutually-recursive rewriting rules.
+However, this would complicate the procedure for checking that a
+type predicate is well-founded.
+
+The second thing I had to accept to get to this conviction is that
+`and(X,Y)` is built into the language.  It can't just be defined
+in Madison code, because while this would be wonderful from a
+standpoint of minimalism, Madison has to know what it means to let
+you write non-trivial inductive proofs.  In a nutshell, it has to
+know that `foo(X) -> and(bar(X),baz(X))` means that if `foo(X)` is
+true, then `bar(X)` is also true, and `baz(X)` is true as well.
+
+I considered making `or(X,Y)` a built-in as well, but after some
+thought, wasn't convinced that it was that valuable in the kinds
+of proofs I wanted to write.
+
+Lastly, the third thing I had to come to terms with was, in general,
+how we know a stated proof is complete.  As I've tried to describe
+above, we know it's complete because each of the cases maps to a
+possible rewrite rule, and induction maps to the inductive definition
+of a type predicate, which we know is well-founded because of the
+checks Madison does on it (ultimately based on the assumption that
+the subterm property is well-founded.)  There Is Nothing Else.
+
+This gets a little more complicated when you get into proofs by
+induction.  The thing there is that we can assume the property
+we want to prove, in one of the cases (the inductive case) of the
+proof, so long as we have already proved all the other cases (the
+base case.)  This is perfectly sound in proofs by hand, so it is
+likewise perfectly sound in a formal proof checker like Madison;
+the question is how Madison "knows" that it is sound, i.e. how it
+can be programmed to reject proofs which are not structured this
+way.  Well, if we limit it to what I've just described above --
+check that the scope of the universal quantification is
+well-founded, check that there are two cases, and check that we've
+already proved one case, then allow the inductive hypothesis to be used
+as a rewrite rule in the other case of the proof -- it is not difficult
+to see how it could be mechanized.
+
+However, this is also very limited.  Let's talk about limitations.
+
+For real data structures, you might well have multiple base cases;
+for example, a tree with two kinds of leaf nodes.  Does this start
+breaking down?  Probably.  It probably breaks down with multiple
+inductive cases, as well, although you might be able to get around
+that by breaking the proof into multiple proofs, and having
+subsequent proofs rely on properties proved in previous proofs.
+
+I discovered another limitation when trying to write a proof that
+addition in Peano arithmetic is commutative.  It seemingly can't
+be done in Madison as it currently stands, as Madison only knows
+how to rewrite something into something else, and cannot express
+the fact that two things (like `add(A,B)` and `add(B,A)`) rewrite
+to the same thing.  Such a facility would be easy enough to add,
+and may appear in a future version of Madison, possibly with a
+syntax like:
+
+    theorem
+      forall A where nat(A)
+        forall B where nat(B)
+          add(A,B) ~=~ add(B,A)
+    proof ...
+
+You would then show that `add(A,B)` reduces to something, and
+that `add(B,A)` reduces to something, and Madison would check
+that the two somethings are in fact the same thing.  This is, in
+fact, a fairly standard method in the world of term rewriting.
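This joinability test can be sketched in Python: rewrite both sides to normal form and compare.  The Peano rules used here (`add(0,Y) -> Y`, `add(s(X),Y) -> s(add(X,Y))`) and the tuple encoding are our own assumptions, not part of Madison:

```python
def reduce(t):
    """Rewrite a ground Peano term to normal form."""
    if isinstance(t, tuple) and t[0] == "add":
        _, a, b = t
        a, b = reduce(a), reduce(b)
        if a == "0":
            return b                             # add.1: add(0,Y) -> Y
        return ("s", reduce(("add", a[1], b)))   # add.2: add(s(X),Y) -> s(add(X,Y))
    if isinstance(t, tuple) and t[0] == "s":
        return ("s", reduce(t[1]))
    return t

one = ("s", "0")
two = ("s", ("s", "0"))
lhs = reduce(("add", one, two))
rhs = reduce(("add", two, one))
assert lhs == rhs == ("s", ("s", ("s", "0")))    # both sides join at s(s(s(0)))
```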
+
+As a historical note, Madison is one of the pieces of fallout from
+the overly-ambitious project I started a year and a half ago called
+Rho.  Rho was a homoiconic rewriting language with several very
+general capabilities, and it wasn't long before I decided it was
+possible to write proofs in it, as well as the other things it was
+designed for.  Of course, this stretched it to about the limit of
+what I could keep track of in a single project, and it was soon
+afterwards abandoned.  Other fallout from Rho made it into other
+projects of mine, including Pail (having `let` bindings within
+the names of other `let` bindings), Falderal (the test suite from
+the Rho implementation), and Q-expressions (a variant of
+S-expressions, with better quoting capabilities, still forthcoming.)
+
+Happy proof-checking!  
+Chris Pressey  
+December 2, 2011  
+Evanston, Illinois  

mdpn/doc/mdpn.html

+<html>
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+  <meta name="Description" content="Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation">
+  <title>Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation</title>
+  <!-- begin html doc dynamic markup -->
+  <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
+  <script type="text/javascript" src="/scripts/documentation.js"></script>
+  <!-- end html doc dynamic markup -->
+</head>
+<body>
+
+<h1>Multi-Directional Pattern Notation</h1>
+<p>Final - Sep 6 1999</p>
+<hr>
+
+<h2>Introduction</h2>
+
+<p>MDPN is an extension of EBNF which adds attributes to it for
+the purposes of scanning and parsing input files that
+assume a non-unidirectional form.  A familiarity with
+EBNF is assumed for the remainder of this document.</p>
+
+<p>MDPN was developed by Chris Pressey in late 1998, with much help
+provided by John Colagioia throughout 1998.  It was built on an
+earlier, less successful attempt at a "2D EBNF", devised to fill a
+void that the mainstream literature on parsing seemed to rarely, if
+ever, approach.</p>
+
+<p>MDPN has possible uses in the construction of
+parsers and subsequently compilers for multi-directional and
+multi-dimensional languages such as Orthogonal, Befunge,
+Wierd, Blank, Plankalk&uuml;l, and even less contrived notations
+like structured Flowchart and Object models of systems.</p>
+
+<p>As the name indicates, MDPN provides a notation for
+describing multidimensional patterns by
+extending the concept of linear scanning and matching 
+with geometric attributes in a given number of dimensions.</p>
+
+<h2>Preconditions for Multidirectional Parsing</h2>
+
+<p>The multidirectional parsing that MDPN concerns itself
+with assumes that any portion of the input file is accessible
+at any time.  Concepts such as LL(1) are fairly meaningless
+in a non-unidirectional parsing system of this sort.
+The unidirectional input devices such as paper tape and
+punch cards that were the concern of original parsing methods
+have been superseded
+by modern devices such as hard disk drives and ample, cheap RAM.</p>
+
+<p>In addition, MDPN is limited to an orthogonal
+representation of the input file, and this
+document is generally less concerned about forms of
+four or higher dimensions, to reduce unnecessary complexity.</p>
+
+<h2>Notation from EBNF</h2>
+
+<p>Syntax is drawn from EBNF.  It is slightly modified, but
+should not surprise anyone who is familiar with EBNF.</p>
+
+<p>A freely-chosen unadorned ('bareword') alphabetic
+multicharacter identifier indicates
+the name of a nonterminal (named pattern) in the
+grammar.  e.g. <code>foo</code>.  (Single characters have special
+meaning as operators.)  Case matters: <code>foo</code> is not the
+same name as <code>Foo</code> or <code>FOO</code>.</p>
+
+<p>Double quotes begin and end literal terminals (symbols.)
+e.g. <code>"bar"</code>.</p>
+
+<p>A double-colon-equals-sign (<code>::=</code>)
+describes a production (pattern match)
+by associating a single nonterminal on the left with
+a pattern on the right, terminated with a period.
+  e.g. <code>foo ::= "bar".</code></p>
+
+<p>A pattern is a series of terminals, nonterminals,
+operators, and parenthetics.</p>
+
+<p>The <code>|</code> operator denotes
+alternatives.  e.g. <code>"foo" | "bar"</code></p>
+
+<p>The <code>(</code> and <code>)</code> parentheses denote
+precedence and grouping.</p>
+
+<p>The <code>[</code> and <code>]</code> brackets denote
+that the contents may be omitted, that is, they
+may occur zero or one times.  e.g. <code>"bar" ["baz"]</code></p>
+
+<p>The <code>{</code> and <code>}</code> braces denote
+that the contents may be omitted or may be repeated any number of times.
+e.g. <code>"bar" {baz "quuz"}</code></p>
+
+<h2>Deviations from EBNF</h2>
+
+<p>The input file is spatially related to a
+coordinate system and it is useful to
+think of the input mapped to an orthogonally
+distributed (Cartesian) form with no arbitrary
+limit imposed on its size, hereinafter referred to
+as <i>scan-space</i>.</p>
+
+<p>The input file is mapped to scan-space.  The
+first printable character in the input file always
+maps to the <i>origin</i> of scan-space regardless of
+the number of dimensions.  The origin is enumerated
+with coordinates (0) in one dimension, (0,0) in two
+dimensions, (0,0,0) in three dimensions, etc.</p>
+
+<p>Scan-space follows the 'computer storage' co-ordinate
+system so that <i>x</i> coordinates increase to
+the 'east' (rightwards), <i>y</i> coordinates increase to the
+'south' (downwards), and <i>z</i> coordinates increase on each
+successive 'page'.</p>
+
+<p>Successive characters in the input file indicate
+successive coordinate (<i>x</i>) values in scan-space.
+For two and three dimensions, end-of-line markers are assumed
+to indicate "reset the <i>x</i> dimension and increment
+the <i>y</i> dimension", and end-of-page markers
+indicate "reset the <i>y</i> dimension and increment
+the <i>z</i> dimension",
+thus following the
+commonplace mapping of computer text printouts.</p>
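<p>As a sketch of this mapping (in Python, with representation details of our own choosing, and glossing over the subtlety of a file that begins with unprintable characters), a two-dimensional input file becomes a partial function from coordinates to symbols:</p>

```python
def to_scan_space(text):
    """Map a 2D input file to scan-space: {(x, y): symbol}."""
    space = {}
    for y, line in enumerate(text.split("\n")):   # end-of-line: reset x, bump y
        for x, ch in enumerate(line):             # successive chars: successive x
            space[(x, y)] = ch
    return space

space = to_scan_space(".\n!")
assert space[(0, 0)] == "."
assert space[(0, 1)] == "!"
# points beyond the bounds of the input are considered whitespace:
assert space.get((5, 5), " ") == " "
```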
+
+<p>Whitespace in the input file is <b>not</b> ignored.
+The terminal <code>" "</code>, however, will match any
+whitespace (including tabs, which are <b>not</b> expanded.)
+The pattern <code>{" "}</code> may be used to indicate
+any number of whitespaces; <code>" " {" "}</code> may be used to
+indicate one or more whitespaces.  Areas of scan-space
+beyond the bounds of the input file are considered to be filled
+with whitespaces.</p>
+
+<p>Therefore, <code>"hello"</code> as a terminal is exactly
+the same as <code>"h" "e" "l" "l" "o"</code> as a pattern of
+terminals.</p>
+
+<p>A <code>}</code> closing brace can be followed by a <code>^</code>
+(<i>constraint</i>)
+operator, which is followed by an expression in parentheses.</p>
+
+<ul>
+<p>This expression is actually in a subnotation which supports
+a very simple form of algebra.  The expression (built with
+terms connected by infix <code>+-*/%</code> operators with their
+C language meanings) can either reduce to
+<ul>
+<li>a constant value, as in <code>{"X"} ^ (5)</code>, which
+would match five <code>X</code> terminals in a line; or
+<li>an unknown value, which can involve any single lowercase
+letters, which indicate variables local to the production,
+as in <code>{"+"}^(x) {"-"}^(x*2)</code>, which would match only twice
+as many minus signs as plus signs.
+</ul></p>
+
+<p>Complex algebraic expressions in constraints can and probably
+should be avoided when constructing a MDPN grammar
+for a real (non-contrived) compiler.  MDPN-based
+compiler-compilers aren't expected to support more than
+one or two unknowns per expression, for example.
+There is no such restriction, of course,
+when using MDPN as a guide for hand-coding a multidimensional
+parser, or otherwise using it as a more 
+sophisticated pattern-matching tool.</p>
+</ul>
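<p>The second constraint example can be checked directly.  The following Python sketch (ours, not generated from any grammar) accepts a run of plus signs followed by exactly twice as many minus signs, as <code>{"+"}^(x) {"-"}^(x*2)</code> requires:</p>

```python
import re

def matches_constraint(s):
    """Match {"+"}^(x) {"-"}^(x*2): twice as many minus signs as plus signs."""
    m = re.fullmatch(r"(\+*)(-*)", s)
    if not m:
        return False
    x = len(m.group(1))
    return len(m.group(2)) == x * 2      # the constraint: minuses = x*2

assert matches_constraint("++----")
assert not matches_constraint("++---")
```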
+
+<h2>The Scan Pointer</h2>
+
+<p>It is useful to imagine a <i>scan pointer</i> (SP,
+not to be confused with a <i>stack pointer</i>, which is not
+the concern of this document) which is analogous to the
+current token in a single-dimensional parser, but exists
+in MDPN as a free spatial relationship to the input file, and
+thus also has associated geometric attributes such as
+direction.</p>
+
+<p>The SP's <i>location</i> is advanced through scan-space by its
+<i>heading</i> as terminals in the productions
+are successfully matched with symbols in the input buffer.</p>
+
+<p>The following geometric attribution operators modify the
+properties of the SP.  Note that injudicious
+use of any of these operators <i>can</i> result in an infinite loop
+during scanning.  There is no built-in contingency measure to
+escape from an infinite parsing loop in MDPN (but see
+exclusivity, below, for a possible way to overcome this.)</p>
+
+<p><code>t</code> is the relative translation operator.  It is
+followed by a vector, in parentheses, which is added to the
+location of the SP.  This does not change its heading.</p>
+
+<p>For example, <code>t (0,-1)</code> moves the
+SP one symbol above the current symbol (the symbol which was
+<i>about</i> to be matched.)</p>
+
+<p>As a more realistic example of how this works, consider that the
+pattern <code>"." t(-1,1) "!" t(0,-1)</code> will match a
+period with an exclamation point directly below it, like:
+
+<pre>.
+!</pre>
+</p>
+
+<p><code>r</code> is the relative rotation operator.  It is
+followed by an axis identifier (optional: see below)
+and an orthogonal angle (an angle <i>a</i>
+such that |<i>a</i>| <b>mod</b> 90 degrees = 0) assumed to
+be measured in degrees, both in parentheses.
+The angle is added to the SP's heading.
+Negative angle arguments are allowed.</p>
+
+<p>Described in two dimensions, the (default) heading 0 denotes 'east,' that
+is, parsing character by character in a rightward direction, where
+the SP's <i>x</i> axis coordinate increases and all other axes coordinates
+stay the same. Increasing angles ascend counterclockwise (90 = 'north',
+180 = 'west', 270 = 'south'.)</p>
+
+<p>For example, <code>">" r(-90) "+^"</code> would match
+
+<pre>&gt;+
+ ^</pre>
+</p>
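<p>Both of the preceding examples can be hand-checked with a small Python model of the SP.  This is our own sketch of the semantics described here (locations, headings in degrees, <code>t</code> and <code>r</code>), not a prescribed implementation:</p>

```python
# Unit step per heading: 0 = east; angles ascend counterclockwise,
# with y increasing to the 'south' as in scan-space.
STEP = {0: (1, 0), 90: (0, -1), 180: (-1, 0), 270: (0, 1)}

class SP:
    """Scan pointer: a location plus a heading over scan-space."""
    def __init__(self, space):
        self.space, self.x, self.y, self.heading = space, 0, 0, 0

    def match(self, lit):
        # Match each symbol of lit, advancing along the current heading;
        # points outside the input count as whitespace.
        for ch in lit:
            if self.space.get((self.x, self.y), " ") != ch:
                return False
            dx, dy = STEP[self.heading]
            self.x, self.y = self.x + dx, self.y + dy
        return True

    def t(self, dx, dy):
        self.x, self.y = self.x + dx, self.y + dy    # relative translation

    def r(self, angle):
        self.heading = (self.heading + angle) % 360  # relative rotation (about z)

def to_space(text):
    return {(x, y): ch
            for y, line in enumerate(text.split("\n"))
            for x, ch in enumerate(line)}

# "." t(-1,1) "!" -- a period with an exclamation point directly below it:
sp = SP(to_space(".\n!"))
assert sp.match(".")
sp.t(-1, 1)
assert sp.match("!")

# ">" r(-90) "+^" -- turn south (heading 270) after the ">":
sp = SP(to_space(">+\n ^"))
assert sp.match(">")
sp.r(-90)
assert sp.match("+^")
```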
+
+<p>The axis identifier indicates which axis this rotation
+occurs around.  If the axis identifier is omitted,
+the <i>z</i> axis is to be assumed, since this is certainly the
+most common axis to rotate about, in two dimensions.</p>
+
+<p>If the axis identifier is present, it may be a single letter
+in the set <code>xyz</code> (these unsurprisingly indicate
+the <i>x</i>, <i>y</i>, and <i>z</i> dimensions respectively),
+or it may be a non-negative integer value, where 0 corresponds to the
+<i>x</i> dimension, 1 corresponds to the <i>y</i> dimension, etc.
+(Implementation note: in more than two dimensions,
+the SP's heading property should probably be broken up
+internally into theta, rho, &amp;c components as appropriate.)</p>
+
+<p>For example, <code>r(z,180)</code> rotates the SP's heading
+180 degrees about the <i>z</i> (dimension #2) axis, as does <code>r(2,180)</code>
+or even just <code>r(180)</code>.</p>
+
+<p><code>&lt;</code> and <code>&gt;</code> are the push
+and pop state-stack operators, respectively.  Alternately,
+they can be viewed as lookahead-assertion parenthetics, since the stack
+is generally assumed to be
+local to the production.  (Compiler-compilers should probably
+notify the user, but not necessarily panic, if they find unbalanced 
+<code>&lt;&gt;</code>'s.)</p>
+
+<p>All properties of the SP (including location and heading,
+and scale factor if supported) are pushed as a group
+onto the stack during <code>&lt;</code> and popped as a group
+off the stack during <code>&gt;</code>.</p>
+
+<h2>Advanced SP Features</h2>
+
+<p>These features are not absolutely necessary for most
+non-contrived multi-directional grammars.  MDPN compiler-compilers
+are not expected to support them.</p>
+
+<p><code>T</code> is the absolute translation operator.  It is
+followed by a vector which is assigned to the location of the SP.
+ e.g. <code>T (0,0)</code> will 'home' the scan.</p>
+
+<p><code>R</code> is the absolute rotation operator.  It is
+followed by an optional axis identifier, and an
+orthogonal angle assumed to be measured in degrees.
+The SP's heading is set to this
+angle.  e.g. <code>R(270)</code> sets the
+SP scanning downwards through the input text, line after line.  See
+the <code>r</code> operator, above, for how the axis identifier functions.</p>
+
+<p><code>S</code> is the absolute scale operator.  It is
+followed by an orthogonal <i>scaling factor</i> (a scalar <i>s</i>
+such that <i>s</i> = <b>int</b>(<i>s</i>) and <i>s</i> >= 1).
+The SP's scale factor is set to this value.  The finest
+possible scale, 1, indicates a 1:1 map with the input file;
+for each one input symbol matched, the SP advances one symbol
+in its path.  When the scale factor is two, then for each
+one input symbol matched, the SP advances two symbols, skipping
+over an interim symbol.  Etc.</p>
+
+<p><code>s</code> is the relative scale operator.  It is
+followed by a scalar integer which is added to the SP's scaling factor
+(so long as it does not cause the scaling factor to be zero or negative.)</p>
+
+<p>Scale operators may also take an optional axis identifier
+(as in <code>S(y,2)</code>), but when the axis identifier is omitted,
+all axes are assumed (non-distortional scaling).</p>
+
+<p><code>!&gt;</code> is a state-assertion alternative to
+<code>&gt;</code>, for the purpose of determining that
+the SP successfully and completely
+reverted to a previous state that
+was pushed onto the stack ('came full circle').  This
+operator is something of a luxury; a
+grammar which uses constraints correctly should
+never <i>need</i> it, but it can come in handy.</p>
+
+<h2>Other Advanced Features: Exclusivity</h2>
+
+<p>Lastly, in the specification of a production,
+the <i>exclusivity</i> applying to that production can be given
+between a hyphen following
+the name of the nonterminal, and the <code>::=</code> operator.</p>
+
+<p>Exclusivity is a list of productions, named by their
+nonterminals, and comes into play at any particular <i>instance</i>
+of the production (i.e. when the production successfully
+matches specific symbols at specific points in scan-space during a parse,
+called the <i>domain</i>.)  The exclusivity describes how the domain of each
+instance is protected from being the domain of any further instances.
+The domain of any subsequent instances of any productions listed
+in the exclusivity is restricted from sharing points in scan-space
+with the established domain.</p>
+
+<p>Exclusivity is a measure to prevent so-called <i>crossword
+grammars</i> - that is, where instances of productions
+can <i>overlap</i> and share common symbols - if desired.
+Internally it's generally considered a list of
+'used-by-this-production' references
+associated with each point in scan-space.
+An example of the syntax to specify exclusivity is
+<code>bar - bar quuz ::= foo {"a"} baz</code>.
+Note that the domain of an instance of <code>bar</code> is the
+sum of the domains of <code>foo</code>, <code>baz</code>, and the chain
+of "<code>a</code>" terminals,
+and that no subsequent instance of <code>quuz</code> or <code>bar</code>
+can overlap it.</p>
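<p>The 'used-by-this-production' bookkeeping might be sketched as follows, in Python; the representation is our own guess at one possible internal form:</p>

```python
excluded_at = {}   # point in scan-space -> set of production names barred from it

def claim(name, domain, exclusivity):
    """Try to claim `domain` (a set of points) for an instance of production
    `name`, whose exclusivity lists the productions barred from overlapping it."""
    if any(name in excluded_at.get(p, ()) for p in domain):
        return False                      # a point is protected against `name`
    for p in domain:
        excluded_at.setdefault(p, set()).update(exclusivity)
    return True

# bar - bar quuz ::= ...  (bar's domain excludes later bar and quuz instances)
assert claim("bar", {(0, 0), (1, 0)}, {"bar", "quuz"})
assert not claim("quuz", {(1, 0)}, {"quuz"})   # shares a point with bar's domain
assert claim("zip", {(1, 0)}, set())           # zip was not excluded, so it may overlap
```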
+
+<h2>Examples of MDPN-described Grammars</h2>
+
+<p><b>Example 1.</b>  A grammar for describing boxes.</p>
+
+<p>The task of writing a translator to recognize a
+two-dimensional construct such as a box can easily
+be realized using a tool such as MDPN.</p>
+
+<p>An input file might contain a box drawn in
+ASCII characters, such as
+
+<pre>+------+
+|      |
+|      |
++------+</pre>
+</p>
+
+<p>Let's also say that boxes have a minimum height of four (they
+must contain at least two interior rows), but
+no minimum width.  Also, edge characters must match up with
+which edge they are on.  So, the following forms are both
+illegal inputs:
+
+<pre>+-+
++-+</pre>
+
+<pre>+-|-+
+|   |
+*    
+|   |
++-|-+</pre>
+</p>
+
+<p>The MDPN production used to describe this box might be
+
+<pre>  Box ::= "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90)
+          "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90).</pre>
+</p>
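<p>A hand-coded recognizer for this box language, of the sort MDPN is meant to guide, might look like the following Python sketch.  It mirrors the matched <code>w</code> and <code>h</code> constraints of the production; we additionally require the interior to be blank, as in the example (the production itself only constrains the perimeter):</p>

```python
def is_box(text):
    lines = text.split("\n")
    if len(lines) < 4:                    # "||" {"|"}^(h): at least two interior rows
        return False
    w = len(lines[0]) - 2                 # length of the "-" run between the "+"s
    if any(len(ln) != w + 2 for ln in lines):
        return False
    top_bottom = "+" + "-" * w + "+"
    side = "|" + " " * w + "|"            # assumption: interior is blank
    return (lines[0] == top_bottom and lines[-1] == top_bottom
            and all(ln == side for ln in lines[1:-1]))

assert is_box("+------+\n|      |\n|      |\n+------+")
assert not is_box("+-+\n+-+")                           # no interior rows
assert not is_box("+-|-+\n|   |\n*    \n|   |\n+-|-+")  # wrong edge characters
```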
+
+<p><b>Example 2.</b>  A simplified grammar for Plankalk&uuml;l's assignments.</p>
+
+<p>An input file might contain an ASCII approximation of something
+Zuse might have jotted down on paper:
+
+<pre> |Z   + Z   => Z
+V|1     2      3
+S|1.n   1.n    1.n</pre>
+</p>
+
+<p>Simplified MDPN productions used to describe this might be
+
+<pre>
+Staff     ::= Spaces "|" TempVar AddOp TempVar Assign TempVar.
+TempVar   ::= "Z" t(-1,1) Index t(-1,1) Structure t(0,-2) Spaces.
+Index     ::= &lt;Digit {Digit}&gt;.
+Digit     ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9".
+Structure ::= &lt;Digit ".n"&gt;.
+AddOp     ::= ("+" | "-") Spaces.
+Assign    ::= "=>" Spaces.
+Spaces    ::= {" "}.
+</pre>
+</p>
+
+<hr>
+<i>Last Updated Mar 2&nbsp;<img src="/images/icons/copyright.gif"
+   align=absmiddle width=12 height=12 alt="(c)" border=0>2004 <a
+   href="/">Cat's Eye Technologies</a>.
+All rights reserved.
+This document may be freely redistributed in its unchanged form.</i>
+</body></html>

oozlybub-and-murphy/doc/oozlybub-and-murphy.html

+<?xml version="1.0"?>
+<!-- encoding: utf-8 -->
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+<title>The Oozlybub and Murphy Programming Language</title>
+  <!-- begin html doc dynamic markup -->
+  <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
+  <script type="text/javascript" src="/scripts/documentation.js"></script>
+  <!-- end html doc dynamic markup -->
+</head>
+<body>
+
+<h1>Oozlybub and Murphy</h1>
+
+<p>Language version 1.1</p>
+
+<h2>Overview</h2>
+
+<p>This document describes a new programming language.
+The name of this language is <dfn>Oozlybub and Murphy</dfn>.
+Despite appearances, this name refers to a single language.
+The majority of the language is
+named Oozlybub.  The fact that the language is not entirely named
+Oozlybub is named Murphy.  Deal with it.</p>
+
+<p>For the sake of providing an "olde tyme esoterickal de-sign", the
+language combines several unusual features, including multiple interleaved parse
+streams, infinitely long variable names, gratuitously strong typing, and
+only-conjectural Turing completeness.
+While no implementation of the language exists as of this writing, it is thought to
+be sufficiently consistent to be implementable, modulo any errors in this document.</p>
+
+<p>In places the language may resemble <a href="/projects/smith/">SMITH</a>
+and <a href="/projects/quylthulg/">Quylthulg</a>, but this was not intended,
+and the similarities are purely emergent.</p>
+
+<h2>Program Structure</h2>
+
+<p>An Oozlybub and Murphy program consists of a number of <dfn>variables</dfn> and
+a number of objects called <dfn>dynasts</dfn>.  An Oozlybub and Murphy program text
+consists of multiple <dfn>parse streams</dfn>.  Each parse stream contains
+zero or more variable declarations, and optionally a single dynast.</p>
+
+<h3>Parse Streams</h3>
+
+<p>A parse stream is just a segment, possibly non-contiguous, of the text of
+an Oozlybub and Murphy program.  A program starts out with a single parse stream, but
+certain <dfn>parse stream manipulation pragmas</dfn> can change this.
+These pragmas have the form <code>{@<var>x</var>}</code> and
+have the same syntactic status as comments; they can appear anywhere except
+inside a lexeme.</p>
+
+<p>Parse streams are arranged in a ring (a cyclic doubly-linked list).  When parsing
+of the program text begins, there is already a single pre-created parse stream.
+When the program text ends, any parse streams which are still active are deleted.</p>
+
+<p>The meanings of the pragmas are:</p>
+
+<ul>
+<li><code>{@+}</code> Create a new parse stream to the right of the current one.</li>
+<li><code>{@&gt;}</code> Switch to the parse stream to the right of the current one.</li>
+<li><code>{@&lt;}</code> Switch to the parse stream to the left of the current one.</li>
+<li><code>{@-}</code> Delete the current parse stream.  The parse stream to the
+left of the deleted parse stream will become the new current parse stream.</li>
+</ul>
+
+<p>Deleting a parse stream while it contains an unfinished syntactic construct is a syntax
+error, just as an end-of-file in that circumstance would be in most other languages.</p>
+
+<p>Providing a concrete example of parse streams in action is difficult in the
+absence of defined syntax for the rest of Oozlybub and Murphy, so we will, for the purposes of
+the following demonstration only, pretend that the content of a parse stream is a sentence of English.
+Here is how three parse streams might be managed:</p>
+
+<p><code>The quick {@+}brown{@&gt;}Now is the time{@&lt;}fox{@&lt;} for all good
+men to {@+}{@&gt;}Wherefore art thou {@&gt;} jumped over {@&gt;}{@&gt;}Romeo?{@-}
+come to the aid of {@&gt;}the lazy dog's tail.{@-}their country.{@-}
+</code></p>
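<p>To make the pragma semantics concrete, here is a minimal Python sketch (an
illustration, not part of the language) that simulates the ring of parse
streams over a program text and reports each stream's contents in the order
the streams are deleted:</p>

```python
import re

def run_streams(program_text):
    """Simulate the {@+}, {@>}, {@<} and {@-} parse stream pragmas.

    Returns the contents of each stream, whitespace-collapsed, in the
    order the streams were deleted; streams still alive at end of text
    are deleted last, in ring order.
    """
    streams = [""]        # the ring of parse streams, as a Python list
    cur = 0               # index of the current parse stream
    completed = []        # contents of deleted streams, in deletion order
    # re.split alternates literal text with the captured pragma character
    for i, part in enumerate(re.split(r"\{@(.)\}", program_text)):
        if i % 2 == 0:                        # ordinary program text
            if streams:
                streams[cur] += part
        elif part == "+":                     # new stream to the right
            streams.insert(cur + 1, "")
        elif part == ">":                     # switch right around the ring
            cur = (cur + 1) % len(streams)
        elif part == "<":                     # switch left around the ring
            cur = (cur - 1) % len(streams)
        elif part == "-":                     # delete current; left becomes current
            completed.append(streams.pop(cur))
            cur = (cur - 1) % len(streams) if streams else 0
    completed.extend(streams)                 # end of text deletes survivors
    return [" ".join(s.split()) for s in completed]
```

<p>Run on the three-sentence example above, this yields "Wherefore art thou
Romeo?" first (it is the first stream deleted), then the quick-brown-fox
sentence, then the "Now is the time..." sentence last.</p>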
+
+<h3>Variables</h3>
+
+<p>All variables are declared in a block at the beginning of a parse stream.
+If there is also a dynast in that stream, the variables are private to that dynast;
+otherwise they are global and shared by all dynasts.  (<em>Defined in 1.1</em>)
+Any dynamically created dynast gets its own private copies of any private
+variables the original dynast had; they will initially hold the values they
+had in the original, but they are not shared.</p>
+
+<p>The name of a variable in Oozlybub and Murphy is not a fixed, finite-length string of symbols,
+as you would find in other programming languages.  No sir!  In Oozlybub and Murphy, each
+variable is named by a possibly-infinite set of strings
+(over the alphanumeric-plus-spaces alphabet <code>[a-zA-Z0-9 ]</code>),
+at least one of which must be infinitely long.  (<em>New in 1.1</em>: spaces
+[but no other kinds of whitespace] are allowed in these strings.)</p>
+
+<p>To accommodate this method of identifying a variable, Oozlybub and Murphy programs,
+which are finite, identify variables using regular expressions
+which match their sets of names.
+An equivalence class of regular expressions is the set of all regular expressions which
+accept exactly the same set of strings; every member of such an equivalence class
+refers to the same, unique Oozlybub and Murphy variable.</p>
+
+<p>(In case you wonder about the implementability of this:
+Checking that two regular expressions are
+equivalent is decidable: we convert them both to NFAs, then
+to DFAs, then minimize those DFAs, then check if the transition
+graphs of those minimal DFAs are isomorphic (the minimal DFA for a
+regular language is unique up to isomorphism).
+Checking that a regular expression accepts at least
+one infinitely-long string is also decidable: look for a reachable
+cycle in the DFA's graph from which an accepting state can still be reached.)</p>
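<p>The equivalence check can be sketched in a few lines of Python.  Rather
than minimizing and comparing for isomorphism as described above, this
(equivalent) formulation searches the product automaton: two complete DFAs
accept different languages exactly when some reachable pair of states
disagrees on acceptance.  The DFA encoding here is invented for the example.</p>

```python
from collections import deque

def dfa_equal(d1, d2, alphabet):
    """Decide language equivalence of two complete DFAs.

    Each DFA is a triple (start, accepting_states, delta), where
    delta[state][symbol] gives the successor state.
    """
    (s1, f1, t1), (s2, f2, t2) = d1, d2
    seen = {(s1, s2)}
    queue = deque([(s1, s2)])
    while queue:
        a, b = queue.popleft()
        if (a in f1) != (b in f2):      # one accepts, the other rejects
            return False
        for c in alphabet:
            pair = (t1[a][c], t2[b][c])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True
```

<p>For example, hand-built DFAs for <code>/pp*/</code> and <code>/p*p/</code>
test as equal, so those two spellings name the same variable.</p>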
+
+<p>Note that these identifier-sets need not be disjoint.  <code>/ma*/</code>
+and <code>/mb*/</code> are distinct variables, even though both name-sets contain the
+string <code>m</code>.  (Note also that we are fudging slightly on what counts
+as having described an infinitely long name; technically we would
+want a Büchi automaton that specifies an unending repetition with
+<sup>ω</sup> instead of *.  But the distinction is subtle enough in this
+context that we're gonna let it slide.)</p>
+
+<p>Syntax for giving a variable name is fairly straightforward: it is delimited on
+either side by <code>/</code> symbols; the alphanumeric symbols are literals;
+textual concatenation is regular expression sequencing,
+<code>|</code> is alternation, <code>(</code> and <code>)</code> increase precedence,
+and <code>*</code> is Kleene asteration (zero or more occurrences).
+Asteration has higher precedence than sequencing, which has higher precedence
+than alternation.  Because none of these operators is alphanumeric or a space, no
+escaping scheme needs to be installed.</p>
+
+<p>Variables are declared with the following syntax (<code>i</code> and <code>a</code>
+are the types of the variables, described in the next section):</p>
+
+<pre>
+VARIABLES ARE i /pp*/, i /qq*/, a /(0|1)*/.
+</pre>
+
+<p>This declares an integer variable identified by the names {<code>p</code>, <code>pp</code>, <code>ppp</code>, ...},
+an integer variable identified by the names {<code>q</code>, <code>qq</code>, <code>qqq</code>, ...},
+and an array variable whose names are all the strings of <code>0</code>'s and <code>1</code>'s.</p>
+
+<p>When not in wimpmode (see below), any regular expression which denotes a
+variable may not be literally repeated anywhere else in the program.  So in the above example,
+it would not be legal to refer to <code>/pp*/</code> further down in the program; an equivalent regular
+expression, such as <code>/p|ppp*/</code> or <code>/p*p/</code> or
+<code>/pp*|pp*|pp*/</code> would have to be used instead.</p>
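<p>Since alternation is idempotent (<var>L</var>|<var>L</var> denotes the same
language as <var>L</var>), one mechanical way to keep minting literally
distinct spellings of the same variable, as the last example above does, is
to alternate the pattern with itself.  A tiny hypothetical helper:</p>

```python
def freshen(pattern, n):
    """Return the pattern alternated with itself n extra times.

    All spellings denote the same name-set, but each is a literally
    distinct regular expression: "pp*", "pp*|pp*", "pp*|pp*|pp*", ...
    """
    return "|".join([pattern] * (n + 1))
```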
+
+<h3>Types</h3>
+
+<p>Oozlybub and Murphy is a statically-typed language, in that variables
+as well as values have types, and a value of one type cannot be stored in
+a variable of another type.  The types of values, however, are not entirely
+disjoint, as we will see, and special considerations may arise for checking
+and conversion because of this.</p>
+
+<p>The basic types are:</p>
+
+<ul>
+<li><code>i</code>, the type of integers.
+<p>These are integers of unbounded extent, both positive and negative.
+Literal constants of type <code>i</code> are given in the usual decimal format.
+Variables of this type initially contain the value 0.</p>
+</li>
+
+<li><code>p</code>, the type of prime numbers.
+<p>All prime numbers are integers but not all integers are prime numbers.
+Thus, values of prime number type are automatically coerced to
+integers in contexts that require integers; in the other direction, however,
+no automatic coercion occurs, and a conversion function (<code>P?</code>) must
+be used.  There are no literal constants of type <code>p</code>.  Variables
+of this type initially contain the value 2.</p>
+</li>
+
+<li><code>a</code>, the type of arrays of integers.
+<p>An integer array has an integer index which is likewise of unbounded
+extent, both positive and negative.  Variables of this type initially
+contain an empty array value, where all of the entries are 0.</p>
+</li>
+
+<li><code>b</code>, the type of booleans.
+<p>A boolean has two possible values, <code>true</code> and <code>false</code>.
+Note that there are no literal constants of type <code>b</code>; these must be
+specified by constructing a tautology or contradiction with boolean (or other) operators.
+It is illegal to retrieve the value of a variable of this type before first assigning it,
+except to construct a tautology or contradiction.</p>
+</li>
+
+<li><code>t</code>, the type of truth-values.
+<p>A truth-value has two possible values, <code>yes</code> and <code>no</code>.
+There are no literal constants of type <code>t</code>.
+It is illegal to retrieve the value of a variable of this type before first assigning it,
+except to construct a tautology or contradiction.</p>
+</li>
+
+<li><code>z</code>, the type of bits.
+<p>A bit has two possible values, <code>one</code> and <code>zero</code>.
+There are no literal constants of type <code>z</code>.
+It is illegal to retrieve the value of a variable of this type before first assigning it,
+except to construct a tautology or contradiction.</p>
+</li>
+
+<li><code>c</code>, the type of conditions.
+<p>A condition has two possible values, <code>go</code> and <code>nogo</code>.
+There are no literal constants of type <code>c</code>.
+It is illegal to retrieve the value of a variable of this type before first assigning it,
+except to construct a tautology or contradiction.</p>
+</li>
+
+</ul>
+
+<h3>Wimpmode</h3>
+
+<p>(<em>New in 1.1</em>) An Oozlybub and Murphy program is in <dfn>wimpmode</dfn> if it
+declares a global variable of integer type one of whose names is the string <code>am a wimp</code>,
+for example:</p>
+
+<pre>
+VARIABLES ARE i /am *a *wimp/.
+</pre>
+
+<p>Certain language constructs, noted in this document as such,
+are only permissible in wimpmode.
+If they are used in a program in which wimpmode is not in effect, a
+compile-time error shall occur and the program shall not be executed.</p>
+
+<h3>Dynasts</h3>
+
+<p>Each dynast is labelled with a positive integer and contains an expression.
+Only one dynast may appear in any given parse stream, but dynasts may also
+be created dynamically during program execution.</p>
+
+<p>Program execution begins at the lowest-numbered dynast that exists in the initial program.
+When a dynast is executed, the expression of that dynast is evaluated for its side-effects.
+If there is a dynast labelled with the next higher integer (i.e. the successor of the label
+of the current dynast),
+execution continues with that dynast; otherwise, the program halts.
+Once a dynast has been executed, it continues to exist until the program
+halts, but it may never be executed again.</p>
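<p>The control flow just described amounts to a small scheduler: run the
lowest label, then keep taking successor labels until one is missing.  A
Python sketch, with dynast bodies modelled (purely for illustration) as
zero-argument callables whose side effects may add new labels:</p>

```python
def run_dynasts(dynasts):
    """Execute dynasts per the rule above; returns the labels run, in order.

    dynasts maps an integer label to a callable; calling it stands in
    for evaluating the dynast's expression for its side-effects, which
    may include inserting newly created dynasts into the mapping.
    """
    trace = []
    if not dynasts:
        return trace
    label = min(dynasts)              # execution begins at the lowest label
    while label in dynasts:
        trace.append(label)
        dynasts[label]()              # evaluate for side-effects
        label += 1                    # successor label, or halt if absent
    return trace
```

<p>Note that a dynast whose label is not reached by the successor chain
(say, label 9 when no dynast 8 ever exists) is never executed.</p>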
+
+<p>Evaluation of an expression may have side-effects, including
+writing characters to an output channel, reading characters from
+an input channel, altering the value of a variable, and creating
+a new dynast.</p>
+
+<p>Dynasts are written with the syntax <code>dynast(<var>label</var>) &lt;-&gt; <var>expr</var></code>.  A concrete example follows:</p>
+
+<pre>
+dynast(100) &lt;-&gt; for each prime /p*/ below 1000 do write (./p*|p/+1.)
+</pre>
+
+<h3>TRIVIA PORTION OF SHOW</h3>
+
+<p>WHO WAS IT FAMOUS MAN THAT SAID THIS?</p>
+
+<ul>
+<li>A) RONALD REAGAN</li>
+<li>B) RONALD REAGAN</li>
+<li>B) RONALD STEWART</li>
+<li>C) RENALDO</li>
+</ul>
+
+<p>contestant enters lightning round now</p>
+
+<h3>Expressions</h3>
+
+<p>In the following, the letter preceding <var>-expr</var> or <var>-var</var> indicates
+the expected type, if any, of that expression or variable.  Where the expressions listed below are infix
+expressions, they are listed from highest to lowest precedence.  Unless noted otherwise,
+subexpressions are evaluated left to right.</p>
+
+<ul>
+
+<li><code>(.<var>expr</var>.)</code>
+<p>Surrounding an expression with <dfn>dotted parens</dfn> gives it that precedence boost
+that's just the thing to have it be evaluated before the expression it's in, but
+there is a catch.  The number of parens in the dotted parens expression must
+match the nesting depth in the following way: if a set of dotted parens is 
+nested within <var>n</var> dotted parens, it must contain fib(<var>n</var>)
+parens, where fib(<var>n</var>) is the <var>n</var>th member of the
+Fibonacci sequence. For example, <code>(.(.0.).)</code> and <code>(.(.((.(((.(((((.0.))))).))).)).).)</code>
+are syntactically well-formed expressions (when not nested in any other dotted paren
+expression), but <code>(.(((.0.))).)</code> and
+<code>(.(.(.0.).).)</code> are not.</p>
+</li>
+
+<li><code><var>var</var></code>
+<p>A variable evaluates to the value it contains at that point in execution.</p>
+</li>
+
+<li><code>0</code>, <code>1</code>, <code>2</code>, <code>3</code>, etc.
+<p>Decimal literals evaluate to the expected value of type <code>i</code>.</p>
+</li>
+
+<li><code>#myself#</code>
+<p>This special nullary token evaluates to the numeric label of the currently
+executing dynast.</p> 
+</li>
+
+<li><code><var>var</var> := <var>expr</var></code>
+<p>Evaluates the <var>expr</var> and stores the result in the specified variable.
+The variable and the expression must have the same type.  Evaluates to whatever
+<var>expr</var> evaluated to.</p>
+</li>
+
+<li><code><var>a-expr</var> [<var>i-expr</var>]</code>
+<p>Evaluates to the <code>i</code> stored at the location in the 
+array given by <var>i-expr</var>.</p>
+</li>
+
+<li><code><var>a-expr</var> [<var>i-expr</var>] := <var>i-expr</var></code>
+<p>Evaluates the second <var>i-expr</var> and stores the result in the location
+in the array given by the first <var>i-expr</var>.  Evaluates to whatever
+the second <var>i-expr</var> evaluated to.</p>
+</li>
+
+<li><code><var>a-expr</var> ? <var>i-expr</var></code>
+<p>Evaluates to <code>go</code> if <code><var>a-expr</var> [<var>i-expr</var>]</code>
+and <code><var>i-expr</var></code> evaluate to the same thing, <code>nogo</code> otherwise.
+The <var>i-expr</var> is only evaluated once.</p>
+</li>
+
+<li><code>minus <var>i-expr</var></code>
+<p>Evaluates to the integer that is zero minus the result of evaluating <var>i-expr</var>.</p>
+</li>
+
+<li><code>write <var>i-expr</var></code>
+<p>Write the Unicode code point whose number is obtained by evaluating <var>i-expr</var>, to
+the standard output channel.  Writing a negative number shall produce one of a number of
+amusing and informative messages which are not defined by this document.</p>
+</li>
+
+<li><code>#read#</code>
+<p>Wait for a Unicode character to become available on the standard
+input channel and evaluate to its integer code point value.</p>
+</li>
+
+<li><code>not? <var>z-expr</var></code>
+<p>Converts a bit value to a boolean value (<code>zero</code> becomes <code>true</code> and <code>one</code> becomes <code>false</code>).</p>
+</li>
+
+<li><code>if? <var>b-expr</var></code>
+<p>Converts a boolean value to a condition value (<code>true</code> becomes <code>go</code> and <code>false</code> becomes <code>nogo</code>).</p>
+</li>
+
+<li><code>cvt? <var>c-expr</var></code>
+<p>Converts a condition value to a truth-value (<code>go</code> becomes <code>yes</code> and <code>nogo</code> becomes <code>no</code>).</p>
+</li>
+
+<li><code>to? <var>t-expr</var></code>
+<p>Converts a truth-value to a bit value (<code>yes</code> becomes <code>one</code> and <code>no</code> becomes <code>zero</code>).</p>
+</li>
+
+<li><code>P? <var>i-expr</var> [<var>t-var</var>]</code>
+<p>If the result of evaluating <var>i-expr</var> is a prime number, evaluates to that
+prime number (and has the type <code>p</code>).  If it is not prime, stores the value <code>no</code>
+into <var>t-var</var> and evaluates to 2.</p>
+</li>
+
+<li><code><var>i-expr</var> * <var>i-expr</var></code>
+<p>Evaluates to the product of the two <var>i-expr</var>s.  The result is never
+of type <code>p</code>, but the implementation doesn't need to do anything
+based on that fact.</p>
+</li>
+
+<li><code><var>i-expr</var> + <var>i-expr</var></code>
+<p>Evaluates to the sum of the two <var>i-expr</var>s.</p>
+</li>
+
+<li><code>exists/dynast <var>i-expr</var></code>
+<p>Evaluates to <code>one</code> if a dynast exists with the given label,
+or <code>zero</code> if one does not.</p>
+</li>
+
+<li><code>copy/dynast <var>i-expr</var>, <var>p-expr</var>, <var>p-expr</var></code>
+<p>Creates a new dynast based on an existing one.  The existing one is identified by
+the label given in the <var>i-expr</var>.  The new dynast is a copy of the existing
+dynast, but with a new label.  The new label is the sum of the two <var>p-expr</var>s.
+If a dynast with that label already exists, the program terminates.
+(<em>Defined in 1.1</em>) This expression evaluates to the value of the given <var>i-expr</var>.</p>
+</li>
+
+<li><code>create/countably/many/dynasts <var>i-expr</var>, <var>i-expr</var></code>
+<p>Creates a countably infinite number of dynasts based on an existing one.  The existing one is identified by
+the label given in the first <var>i-expr</var>.  The new dynasts are copies of the existing
+dynast, but with new labels.  The new labels start at the first odd integer
+greater than the second <var>i-expr</var>, and consist of every odd integer greater than that.
+If any dynast with such a label already exists, the program terminates.
+(<em>Defined in 1.1</em>) This expression evaluates to the value of the first given <var>i-expr</var>.</p>
+</li>
+
+<li><code><var>b-expr</var> and <var>b-expr</var></code>
+<p>Evaluates to <code>one</code> if both <var>b-expr</var>s are <code>true</code>,
+<code>zero</code> otherwise.  Note that this is not short-circuiting; both <var>b-expr</var>s
+are evaluated.</p>
+</li>
+
+<li><code><var>c-expr</var> or <var>c-expr</var></code>
+<p>Evaluates to <code>yes</code> if either or both <var>c-expr</var>s are <code>go</code>,
+<code>no</code> otherwise.  Note that this is not short-circuiting; both <var>c-expr</var>s
+are evaluated.</p>
+</li>
+
+<li><code>do <var>expr</var></code>
+<p>Evaluates the <var>expr</var>, throws away the result, and evaluates to <code>go</code>.</p>
+</li>
+
+<li><code><var>c-expr</var> then <var>expr</var></code>
+<p><strong>Wimpmode only.</strong>  Evaluates the <var>c-expr</var> on the left-hand side for its side-effects only,
+throwing away the result, then evaluates to the result of evaluating the right-hand
+side <var>expr</var>.</p>
+</li>
+
+<li><code><var>c-expr</var> ,then <var>i-expr</var></code>
+<p>(<em>New in 1.1</em>) Evaluates the <var>c-expr</var> on the left-hand side; if it is <code>go</code>,
+evaluates to the result of evaluating the right-hand side <var>i-expr</var>;
+if it is <code>nogo</code>, evaluates to an unspecified and quite possibly random
+integer between 1 and 1000000 inclusive, without evaluating the right-hand side.
+Note that this operator has the same precedence as <code>then</code>.</p>
+</li>
+
+<li><code>for each prime <var>var</var> below <var>i-expr</var> do <var>i-expr</var></code>
+<p>The <var>var</var> must be a declared variable of type <code>p</code>.  The first
+<var>i-expr</var> must evaluate to an integer, which we will call <var>k</var>.
+The second <var>i-expr</var> is evaluated once for each prime number between 
+<var>k</var> and 2, inclusive; each time it is evaluated, <var>var</var> is bound to
+a successively smaller prime number between <var>k</var> and 2.
+(<em>Defined in 1.1</em>)  Evaluates to the result of the final evaluation of the second <var>i-expr</var>.</p>
+</li>
+
+</ul>
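<p>Two of the expression forms above are fiddly enough to merit executable
sketches (Python, illustrative only, not normative): a checker for the
Fibonacci rule on dotted parens, and the descending prime loop.</p>

```python
import re

def fib(n):
    """fib(0) = fib(1) = 1, fib(2) = 2, fib(3) = 3, fib(4) = 5, ..."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def dotted_parens_ok(expr):
    """Check that every dotted-paren group nested inside n others uses
    exactly fib(n) parens on each side, and that the groups balance."""
    depth, stack = 0, []
    # tokens: runs of '(' ending in '.', runs of ')' after '.', other text
    for tok in re.findall(r"\(+\.|\.\)+|[^().]+|[().]", expr):
        if tok[0] == "(" and tok[-1] == ".":          # opening group
            k = len(tok) - 1
            if k != fib(depth):
                return False
            stack.append(k)
            depth += 1
        elif tok[0] == "." and tok[-1] == ")":        # closing group
            if not stack or stack.pop() != len(tok) - 1:
                return False
            depth -= 1
    return not stack

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def for_each_prime_below(k, body):
    """Run body once for each prime between k and 2 inclusive, in
    descending order, returning the final result (the 1.1 rule)."""
    result = None
    for p in range(k, 1, -1):
        if is_prime(p):
            result = body(p)
    return result
```

<p>The checker accepts both well-formed dotted-paren examples given earlier
and rejects both ill-formed ones; the loop visits 7, 5, 3, 2 for
<code>below 10</code> and evaluates to the body's result at 2.</p>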
+
+<h3>Grammar</h3>
+
+<p>This section attempts to capture and summarize the syntax rules (for a single
+parse stream) described above, using an EBNF-like syntax extended with a few
+ad-hoc annotations that I don't feel like explaining right now.</p>
+
+<pre>
+ParseStream  ::= VarDeclBlock {DynastLit}.
+VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".".
+VarDecl      ::= TypeSpec VarName.
+TypeSpec     ::= "i" | "p" | "a" | "b" | "t" | "z" | "c".
+VarName      ::= "/" Pattern "/".
+Pattern      ::= {[a-zA-Z0-9 ]}
+               | Pattern "|" Pattern                    /* ignoring precedence here */
+               | Pattern "*"                            /* and here */
+               | "(" Pattern ")".
+DynastLit    ::= "dynast" "(" Gumber ")" "&lt;-&gt;" Expr.
+Expr         ::= Expr1[c] {"then" Expr1 | ",then" Expr1[i]}.
+Expr1        ::= Expr2[c] {"or" Expr2[c]}.
+Expr2        ::= Expr3[b] {"and" Expr3[b]}.
+Expr3        ::= Expr4[i] {"+" Expr4[i]}.
+Expr4        ::= Expr5[i] {"*" Expr5[i]}.
+Expr5        ::= Expr6[a] {"?" Expr6[i]}.
+Expr6        ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]].
+Prim         ::= {"("}* "." Expr "." {")"}*             /* remember the Fibonacci rule! */
+               | VarName [":=" Expr]
+               | Gumber
+               | "#myself#"
+               | "minus" Expr[i]
+               | "write" Expr[i]
+               | "#read#"
+               | "not?" Expr[z]
+               | "if?" Expr[b]
+               | "cvt?" Expr[c]
+               | "to?" Expr[t]
+               | "P?" Expr[i]
+               | "exists/dynast" Expr[i]
+               | "copy/dynast" Expr[i] "," Expr[p] "," Expr[p]
+               | "create/countably/many/dynasts"
+                   Expr[i] "," Expr[i]
+               | "do" Expr
+               | "for" "each" "prime" VarName "below"
+                   Expr[i] "do" Expr[i].
+Gumber       ::= {[0-9]}.
+</pre>
+
+<h3>Boolean Idioms</h3>
+
+<p>Here we show how we can get any value of any of the <code>b</code>, <code>t</code>,
+<code>z</code>, and <code>c</code> types, without any constants or variables with
+known values of these types.</p>
+
+<pre>
+VARIABLES ARE b /b*/.
+zero = /b*|b/ and not? to? cvt? if? /b*|b*/
+true = not? zero
+go = if? true
+yes = cvt? go
+one = to? yes
+false = not? one
+nogo = if? false
+no = cvt? nogo
+</pre>
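<p>The chain can be checked mechanically.  In the hypothetical Python model
below, each of the four two-valued types is collapsed to a Python bool
(<code>true</code>, <code>go</code>, <code>yes</code>, <code>one</code> as
True), and each conversion operator becomes a function; the first idiom is a
contradiction, so it yields <code>zero</code> whatever the unassigned
variable happens to hold:</p>

```python
def if_(b):   return b        # b -> c : true |-> go,  false |-> nogo
def cvt(c):   return c        # c -> t : go   |-> yes, nogo  |-> no
def to(t):    return t        # t -> z : yes  |-> one, no    |-> zero
def not_(z):  return not z    # z -> b : zero |-> true, one  |-> false

def zero_idiom(b):
    """/b*|b/ and not? to? cvt? if? /b*|b*/ -- 'b and not b', false for
    either value of b, so the 'and' yields the bit zero.  (Python's
    'and' short-circuits where the spec's does not, but the value is
    the same either way.)"""
    return b and not_(to(cvt(if_(b))))
```

<p>Starting from that <code>zero</code>, each later line of the idiom table
is one further application of a conversion operator.</p>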
+
+<h3>Computational Class</h3>
+
+<p>Because the single in-dynast looping construct, <code>for each prime below</code>, is
+always a finite loop, the execution of any fixed number of dynasts cannot be Turing-complete.
+We must create new dynasts at runtime, and continue execution in them, if we want
+any chance at being Turing-complete.  We demonstrate this by showing an example of
+a (conjecturally) infinite loop in Oozlybub and Murphy, an idiom which will doubtless