Commits

catseye  committed 7845891

Initial import of Treacle version 1.0 revision 2010.0427 sources.

  • Participants
  • Tags rel_1_0_2010_0427

Comments (0)

Files changed (9)

File doc/treacle.html

+<?xml version="1.0"?>
+<!-- encoding: UTF-8 -->
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+  <title>The Treacle Programming Language</title>
+  <!-- begin html doc dynamic markup -->
+  <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
+  <script type="text/javascript" src="/scripts/documentation.js"></script>
+  <!-- end html doc dynamic markup -->
+</head>
+<body>
+
+<h1>The Treacle Programming Language</h1>
+
+<p>Language version 1.0.<br/>
+Chris Pressey, Cat's Eye Technologies</p>
+
+<h2>Introduction</h2>
+
+<p><dfn>Treacle</dfn> is a programming language based on an extended form of term-rewriting
+which we shall call, somewhat innacurately (or at least arbitrarily,) <em>context rewriting</em>.</p>
+
+<p>Like Arboretuum, its successor built around <em>forest-rewriting</em>, Treacle was intended as
+a language for specifying compilers.  Treacle is somewhat more successful at pulling it off, however;
+context rewriting encompasses, and is more expressive than, forest-rewriting.</p>
+
+<p><dfn>Context rewriting</dfn> is meant to refer to the fact that Treacle's rewriting patterns
+may contain <dfn>holes</dfn> – designated "containers" for subpatterns which may match not just
+the <em>immediate</em> child of the term which the parent pattern matched (as in conventional term-rewriting)
+but also <em>any one of that child's descendents</em>, no matter how deeply nested.</p>
+
+<p>When a hole is matched to some term, that term is searched for the
+subpattern given inside the hole.  The search may be performed in either
+leftmost-innermost or leftmost-outermost order; this is specified
+by a qualifier associated with the hole.  Because of this, Treacle need not
+specify a language-wide reduction order; the hole construct acts as a kind of
+search operator which explicitly encodes search order into each pattern.</p>
+
+<p>Context rewriting also deconstructs the conventional concept of the variable,
+splitting it into a <dfn>name</dfn> and a <dfn>wildcard</dfn>.  Any pattern or
+subpattern may be named, not just wildcards.  Even holes may be named.  At the same time,
+wildcards, which match arbitrary terms, may occur unnamed.  Upon a successful match,
+only those terms which matched named patterns are recorded in the unifier.</p>
+
+<p>Further, each rule in Treacle may contain multiple terms (<dfn>replacements</dfn>) on the right-hand side
+of a rewriting rule, and each of these may have its own name.  When the term undergoing rewriting
+(called the <dfn>subject</dfn>) is rewritten, each named replacement is substituted into the subject at the position
+matched by the part of the pattern that is labelled by that same name.</p>
+
+<p>Lastly, replacements may contain special atomic terms called <dfn>newrefs</dfn>.  When a
+newref is written into the subject, it takes the form of a new, unique symbol, guaranteed (or at least
+reasonably assumed) to be different from all other symbols that are in, or could be in, the subject.
+When multiple newrefs (possibly in multiple replacements) in the same rule are written into the subject
+at the same time (i.e., on the same rewriting step,)
+they all take the same form (and so are equal to each other, and only to each other – nothing else.)  In Treacle's
+capacity as a compiler-definition language, newrefs are useful for generating internal labels for,
+e.g., translating control structures to machine code jumps.</p>
+
+<p>It is important to remember that, while subpatterns may be nested in holes, and these
+may in turn contain more holes, there is no corresponding hierarchical nature to the <em>bindings</em>
+which occur in Treacle patterns: all variables of the same name must unify to equivalent terms, regardless of where they
+occur in the pattern (inside or outside a hole.)</p>
+
+<h2>Syntax</h2>
+
+<p>We're almost ready to give some examples to elucidate all this, but first we need a
+syntax to give them in.  Here it is:</p>
+
+<ul>
+<li>atoms are denoted by strings of lower-case letters;</li>
+<li>terms are denoted by lists of subterms inside parentheses;</li>
+<li>named terms are denoted by <code>(? <var>name</var> <var>subterm</var>)</code>;</li>
+<li>holes are denoted by <code>(:i <var>subterm</var>)</code> or <code>(:o <var>subterm</var>)</code>,
+corresponding to innermost and outermost search order, respectively;</li>
+<li>wildcards are denoted by <code>*</code>;</li>
+<li>newrefs are denoted by <code>@</code>; and</li>
+<li>named replacements are denoted <code>X : <var>term</var></code>.</li>
+</ul>
+
+<h2>Examples</h2>
+
+<p>Now we are ready to give some examples.</p>
+
+<h3>Patterns</h3>
+
+<ul>
+<li>The pattern <code>(a b (? X *))</code> matches <code>(a b (c (d b)))</code>, with the unifier
+<code>X=(c (d b))</code>.  Also, <code>(a (? Y *) (c (d (? Y *))))</code> matches the same subject
+with <code>Y=b</code>.  This is all quite conventional.</li>
+
+<li>We can also match <code>(a (? X b) *)</code> to this subject.  The unifier will <em>always</em> be
+<code>X=b</code> when this pattern matches, regardless of the subject.  This tells us nothing we
+did not already know.  But it demonstrates the decoupling of names and wildcards in Treacle.  (It will also become useful when we get to replacements, since that atomic <code>b</code> term named by <code>X</code>
+can be supplanted by something: we have named not just a subterm, but a location in the subject.)</li>
+
+<li>The pattern <code>(a b (:i (d b)))</code> matches the subject as well.
+Observe how the hole allowed <code>(d b)</code> to be sought
+inside the subterm at the location where the hole matched.
+Note also that the pattern would just as easily match the subject
+<code>(a b (w x (w y (w z (d b)))))</code>, because it doesn't matter how
+deep <code>(d b)</code> is embedded in the subterm.</li>
+
+<li>If the pattern included a name, like <code>(a b (? X (:i (d b))))</code>,
+the match with the subject would result in the unifier <code>X=(c (d b))</code>.
+Likewise, the pattern <code>(a b (:i (? X (d b))))</code> would match the subject
+with the unifier <code>X=(d b)</code>.</li>
+
+<li>The pattern <code>(a (? X *) (:i (d (? X *))))</code> also matches the
+subject, with the unifier <code>X=b</code>.  This is a good example of the
+expressive power of pattern-matching in Treacle: we are basically asking
+to search the contents of the 3rd subterm, for whatever the 2nd subterm is.</li>
+
+</ul>
+
+<h3>Rules</h3>
+
+<ul>
+
+<li>Say we have a rule where the pattern is <code>(a b (:i (? X (d b))))</code>,
+and the lone replacement is <code>X : a</code>.  This rule would match the
+original subject <code>(a b (c (d b)))</code>, unifying with <code>X=(d b)</code>,
+and would rewrite the subject to <code>(a b (c a))</code>.</li>
+
+<li>Or, say our rule's pattern is <code>(a (? Y *) (:i (? X (d *))))</code>,
+and the set of replacements is {<code>X : (? Y)</code>, <code>Y : (? X)</code>}.
+This rule would also match the subject, with a unifier of {<code>X=(d b)</code>,
+<code>Y=b</code>}, and would rewrite the subject to <code>(a (d b) (c b))</code>.
+Again, notice the expressivity of this rule: we're basically asking Treacle to swap whatever
+occurs next to the <code>a</code>, with whatever occurs alongside a <code>d</code>
+somewhere inside the term that occurs next to that.</li>
+
+</ul>
+
+<h2>Mechanism</h2>
+
+<p>We can think of the mechanism by which context rewriting is undertaken, as follows.</p>
+
+<p>We pattern-match "as usual": recursively traverse the pattern and the subject.  Where there are
+literals in the pattern, we make sure those same values appear in the subject, in the same place.
+Where there are named subpatterns in the pattern, we bind the name to the position in the subject, and
+insert that binding into a unifier, before trying to match the subpattern to that position.
+(We do an occurs check first, to make sure that the name isn't already bound to something else.)</p>
+
+<p>Note that we bind the name, not to a subterm in the subject, but to a <em>position</em> in the subject.
+If you like, you can think of context rewriting building a "unifier by reference" rather than the rather
+more conventional "unifier by value".  This is useful, because the presence of holes means that we will
+have more of a need to know where we want to install a replacement.</p>
+
+<p>When we encounter a hole in the pattern, we take the subpattern that appears in the hole
+and begin searching for that subpattern in the subterm of the subject whose position corresponds
+to the hole.  We pass this subsearch our unifier (so that it can use the variable bindings already
+established for occurs checks.)  If the subsearch fails to match, then we also fail to match.
+If the subsearch succeeds, we continue the pattern-matching process with the unifier it produced.</p>
+
+<p>If everything succeeds, we have a unifier.  We go through the replacements, look up the name
+of each replacement in the unifier to find the location in the subject where it matched, expand all the
+variable names in the replacement with the contents of the unifier, and "poke" the expanded replacement
+into the subject at the location.</p>
+
+<h2>Implementation</h2>
+
+<p>Like Arboretuum, there is a reference implementation of Treacle in relatively pure Scheme, meant to
+normatively fill in any gaps in the description of the language given in this document.</p>
+
+<h2>Discussion</h2>
+
+<p>You may wonder, why forest-rewriting, or context rewriting?  To be sure, it does not add any
+computational power to term-rewriting, which is already Turing-complete.  But it does add a significant
+amount of expressiveness.  While this expressiveness seems to come at a signficant cost (at least,
+as imagined in a naïve implementation,) there are two advantages it might provide, one practical
+and one theoretical, which I'll get to in a second.</p>
+
+<p>The idea latent in forest-rewriting, which I didn't explain too well in the Arboretuum documentation,
+is to <em>partition the subject</em>.  Context rewriting continues and generalizes this idea;
+while in forest-rewriting it is obvious what the partitions are (named trees in a forest,) in
+context-rewriting, the partitions would be subterms of some given term (for example, the top-level
+term.)  An engine implementing context-rewriting might need some supplementary information
+or deductive ability in order to "see" and exploit the partitions, but they could nonetheless be
+identified.</p>
+
+<p>One major effect of partitioning is to ease the locality constraint.  If you've ever tried programming
+in pure term-rewriting, you notice that you have to "keep all your state together": if there are
+multiple pieces of information in the tree of terms that relate to the reduction you want to accomplish,
+they have to be in a bounded distance from, and in a fixed relationship with, each other.
+If some piece is far away, it will have to be brought – <em>literally</em> brought! –
+to bear on the situation, by moving it through the tree through successive "bubbling" rewrites.</p>
+
+<p>Forest-rewriting eases this by having multiple independent trees: some piece of information
+can be anywhere in some other tree.  Context rewriting eases it by having holes in which the
+piece of information can be found anywhere.</p>
+
+<p>Partitioning the subject could have the practical benefit of improving locality of reference
+in the rewriting engine.  Each partition can reside in its own memory buffer which is fixed in
+some way, for example in one or more cache lines.  Since we don't need to "bubble" information
+through the term, each partition can stay in its own cached area, and we should see fewer
+cache misses.</p>
+
+<p>Partitioning the subject could also have the theoretical benefit of making it easier
+to prove that the rewriting terminates,  If you look through some of the unit tests
+in <code>tests.scm</code>, you might notice that some of them go to some lengths to avoid
+rewriting certain trees to anything larger than they were.
+The size of each partition is then monotonically decreasing, and so it will eventually "run out",
+at which point the rewriting process must of course terminate.  We might not be able to
+achieve the ideal case where, on each rewrite, at least one of the partitions shrinks and the
+rest stay the same size. The closer we can come to it, however, the less burdensome should be
+the task of proving that the entire system terminates, because many of the cases should
+be trivial.</p>
+
+<p>Happy whacky rewriting all sorts of fun ways!
+<br/>Chris Pressey
+<br/>Chicago, Illinois
+<br/>April 12, 2008</p>
+
+</body>
+</html>

File src/index.scm

+;
+; Support for term indices (pointers to subterms of terms)
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+;
+; A term index is represented by a list of integers.  It uniquely
+; identifies a subterm position of a term.
+;
+
+;
+; Create a basic term index that refers to the entire term.
+;
+(define mk-base-index
+  (lambda ()
+    '()))
+
+;
+; Return a term index which points to the leftmost subterm of the
+; term pointed to by the given term index.
+;
+(define descend-index
+  (lambda (index)
+    (cons 0 index)))
+
+;
+; Return a term index which points to the closest sibling term to
+; the right of term pointed to by the given term index.
+;
+(define next-index
+  (lambda (index)
+    (cons (+ (car index) 1) (cdr index))))
+
+;
+; Retrieve the subterm of the given term at the given term index.
+;
+(define term-index-fetch
+  (lambda (term index)
+    (cond ((null? index)
+            term)
+          (else
+            (term-index-fetch (list-ref term (car index)) (cdr index))))))
+
+;
+; Return a new term where the subterm at the given term index is replaced
+; by the given replacement subterm.
+;
+(define term-index-store
+  (lambda (term index replacement)
+    (cond ((null? index)
+            replacement)
+          (else
+            (let* ((nth-subterm (list-ref term (car index)))
+                   (new-index   (cdr index))
+                   (new-subterm (term-index-store nth-subterm new-index replacement)))
+              (list-replace term (car index) new-subterm))))))
+
+;
+; Helper function for term-index-store.
+;
+(define list-replace
+  (lambda (elems pos elem)
+    (cond ((eq? pos 0)
+            (cons elem (cdr elems)))
+          (else
+            (cons (car elems) (list-replace (cdr elems) (- pos 1) elem))))))

File src/match.scm

+;
+; Support for matching context-rewriting patterns (names, wildcards, holes)
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+;
+; Shorthand for common usage of match.
+;
+(define toplevel-match
+  (lambda (subject pattern)
+    (match subject subject pattern (mk-empty-unifier) (mk-base-index))))
+
+;
+; Attempt to find a unifier (list of substitutions) which makes
+; the given pattern equal to the given term, and return it.
+; Return #f if the pattern does not match.
+;
+; Note that match, upon matching a hole, calls search-match.
+; This is mutually recursive, since search-match also calls match.
+;
+(define match
+  (lambda (subject term pattern unifier index)
+    (cond ((is-wildcard? pattern)
+            unifier)
+          ((is-named? pattern)
+            (let* ((name     (get-name pattern))
+                   (subpat   (get-named-subpat pattern))
+                   (submatch (match subject term subpat unifier index)))
+              (cond (submatch
+                      ; note that we pass the whole subject here
+                      (bind-name subject (reverse index) name submatch))
+                    (else
+                      #f))))
+          ((is-hole? pattern)
+            (let* ((order  (get-hole-order pattern))
+                   (subpat (get-hole-subpat pattern)))
+              (cond ((eq? order 'innermost)
+                      (search-match-innermost subject term subpat unifier index))
+                    ((eq? order 'outermost)
+                      (search-match-outermost subject term subpat unifier index))
+                    (else
+                      #f))))
+          ((and (list? term) (list? pattern))
+            (match-lists subject term pattern unifier (descend-index index)))
+          ((eqv? term pattern)
+            unifier)
+          (else
+            #f))))
+
+;
+; Helper function for match.
+; Given a term and a pattern, where we know both are lists,
+; fold over both of them, matching all the corresponding elements.
+;
+(define match-lists
+  (lambda (subject term pattern unifier index)
+    (cond ((and (null? term) (null? pattern))  ; end of both
+            unifier)
+          ((or (null? term) (null? pattern))   ; end of one but not the other
+            #f)
+          (else
+            (let ((new-unifier (match subject (car term) (car pattern) unifier index))
+                  (new-index   (next-index index)))
+              (if new-unifier
+                (match-lists subject (cdr term) (cdr pattern) new-unifier new-index)
+                #f))))))
+
+;
+; Match the given pattern to any subterm of the given term, if possible.
+; Give priority to the leftmost outermost subterm that matches.
+; Returns a new unifier, or #f.
+;
+; While searching, this tracks a term index which points to the subterm
+; that matches.  We pass this back in the returned unifier.
+;
+(define search-match-outermost
+  (lambda (subject term pattern unifier index)
+    (let* ((new-unifier (match subject term pattern unifier index)))
+      (cond (new-unifier
+              new-unifier)
+            ((list? term)
+              (search-match-list-outermost subject term pattern unifier (descend-index index)))
+            (else
+              #f)))))
+
+;
+; Helper function.  Try to match the given pattern to each term in the
+; given list, left to right.  Return new unifier on first successful
+; match, or #f is there is none.
+;
+(define search-match-list-outermost
+  (lambda (subject terms pattern unifier index)
+    (cond ((null? terms)
+            #f)
+          (else
+            (let* ((new-unifier (search-match-outermost subject (car terms) pattern unifier index))
+                   (new-index   (next-index index)))
+              (if new-unifier
+                new-unifier
+                (search-match-list-outermost subject (cdr terms) pattern unifier new-index)))))))
+
+;
+; Match the given pattern to any subterm of the given term, if possible.
+; Give priority to the leftmost innermost subterm that matches.
+; Returns a new unifier, or #f.
+;
+; While searching, this tracks a term index which points to the subterm
+; that matches.  We return this in the returned unifier.
+;
+(define search-match-innermost
+  (lambda (subject term pattern unifier index)
+    (cond ((list? term)
+            (let* ((new-unifier (search-match-list-innermost subject term pattern unifier (descend-index index))))
+              (if new-unifier
+                new-unifier
+                (match subject term pattern unifier index))))
+          (else
+            (match subject term pattern unifier index)))))
+
+;
+; Helper function.  Try to match the given pattern to each term in the
+; given list, left to right.  Return new unifier on first successful
+; match, or #f is there is none.
+;
+(define search-match-list-innermost
+  (lambda (subject terms pattern unifier index)
+    (cond ((null? terms)
+            #f)
+          (else
+            (let* ((new-unifier (search-match-innermost subject (car terms) pattern unifier index))
+                   (new-index   (next-index index)))
+              (if new-unifier
+                new-unifier
+                (search-match-list-innermost subject (cdr terms) pattern unifier new-index)))))))

File src/pattern.scm

+;
+; Basic data structures for context-rewriting patterns (names, holes, wildcards)
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+;
+; Patterns are structured as follows.
+;
+; Atoms, numbers, and lists are literals, to be matched.
+;
+; A vector of length 1 where the first and only element is the
+; atom 'wildcard' is a wildcard.
+;
+; A vector of length 3 where the first element is the atom 'named'
+; is a named term.  The second element is an atom giving the name
+; of the term, and the third element gives the subpattern so named.
+;
+; A vector of length 3 where the first element is the atom 'hole'
+; is a hole.  The second element indicates the match search order
+; to apply inside this hole, either 'innermost' or 'outermost'.  The
+; third element is the subpattern to be matched inside this hole.
+;
+
+(define mk-wildcard
+  (lambda ()
+    (vector 'wildcard)))
+
+(define mk-named
+  (lambda (name subpat)
+    (vector 'named name subpat)))
+
+(define mk-hole
+  (lambda (order subpat)
+    (vector 'hole order subpat)))
+
+(define mk-newref
+  (lambda ()
+    (vector 'newref)))
+
+;
+; Helper predicate for following predicates.
+;
+(define is-type?
+  (lambda (label pattern)
+    (and (vector? pattern)
+         (eq? (vector-ref pattern 0) label))))
+
+;
+; Return #t if the given pattern is a named pattern, #f otherwise.
+;
+(define is-named?
+  (lambda (pattern)
+    (is-type? 'named pattern)))
+
+;
+; Return the name of the given pattern variable.
+; Assumes that its input is in fact a pattern variable.
+;
+(define get-name
+  (lambda (named-pattern)
+    (vector-ref named-pattern 1)))     ; just return the 2nd element of the vector
+
+;
+; Return the name of the given pattern variable.
+; Assumes that its input is in fact a pattern variable.
+;
+(define get-named-subpat
+  (lambda (named-pattern)
+    (vector-ref named-pattern 2)))     ; just return the 3nd element of the vector
+
+;
+; Return #t if the given pattern is a hole, #f otherwise.
+;
+(define is-hole?
+  (lambda (pattern)
+    (is-type? 'hole pattern)))
+
+;
+; Return the search order of the given hole.
+; Assumes that its input is in fact a hole.
+;
+(define get-hole-order
+  (lambda (hole)
+    (vector-ref hole 1)))      ; just return the 2nd element of the vector
+
+;
+; Return the subpattern to search for in the given hole.
+; Assumes that its input is in fact a hole.
+;
+(define get-hole-subpat
+  (lambda (hole)
+    (vector-ref hole 2)))      ; just return the 3rd element of the vector
+
+;
+; Return #t if the given pattern is a wildcard, #f otherwise.
+;
+(define is-wildcard?
+  (lambda (pattern)
+    (is-type? 'wildcard pattern)))
+
+;
+; Return #t if the given pattern is a newref, #f otherwise.
+;
+(define is-newref?
+  (lambda (pattern)
+    (is-type? 'newref pattern)))
+
+;
+; Ground terms are a subset of patterns which may not contain
+; wildcards, holes, or named terms.
+;
+(define is-ground?
+  (lambda (term)
+    (cond
+      ((is-wildcard? term)
+        #f)
+      ((is-hole? term)
+        #f)
+      ((is-named? term)
+        #f)
+      ((is-newref? term)
+        #f)
+      ((null? term)
+        #t)
+      ((list? term)
+        (and (is-ground? (car term))
+             (is-ground? (cdr term))))
+      ((number? term)
+        #t)
+      ((symbol? term)
+        #t)
+      (else
+        #f))))
+
+;
+; Replacements are a subset of patterns which may contain named
+; terms, but may not contain wildcards or holes.
+;
+; In addition, replacements may contain newrefs, which are replaced
+; with unique symbols upon expansion.
+;
+(define is-replacement?
+  (lambda (term)
+    (cond
+      ((is-wildcard? term)
+        #f)
+      ((is-hole? term)
+        #f)
+      ((is-named? term)
+        (is-replacement? (get-named-subpat term)))
+      ((is-newref? term)
+        #t)
+      ((null? term)
+        #t)
+      ((list? term)
+        (and (is-replacement? (car term))
+             (is-replacement? (cdr term))))
+      ((number? term)
+        #t)
+      ((symbol? term)
+        #t)
+      (else
+        #f))))

File src/reduce.scm

+;
+; Support for reducing terms via context-rewriting
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+;
+; Given a rule (a pair of a pattern and a map of replacements,)
+; apply the rule to the given subject.  If the pattern part of
+; the rule matches the subject, replace the subterms that matched
+; named subpatterns with the expanded replacement whose key in
+; the map is that name.
+;
+(define apply-rule
+  (lambda (subject pattern replacements generation-id)
+    (let* ((unifier (match subject subject pattern (mk-empty-unifier) (mk-base-index))))
+      (if unifier
+        (apply-unifier subject subject unifier unifier replacements generation-id)
+        #f))))
+
+;
+; Helper function for apply-rule.  For each substitution in the unifier whose
+; name is present in some replacement, expand that replacement with values from
+; the unifier, and graft it into the subject at the position given in the unifier.
+;
+(define apply-unifier
+  (lambda (complete-subject subject complete-unifier unifier replacements generation-id)
+    (if (null? unifier)
+      subject
+      (let* ((unif-pair         (car unifier))
+             (rest-of-unif      (cdr unifier))
+             (name              (car unif-pair))
+             (index             (cdr unif-pair))
+             (repl-pair         (assq name replacements)))
+        (if repl-pair
+          (let* ((replacement   (cdr repl-pair))
+                 (expanded-repl (expand-vars complete-subject replacement complete-unifier generation-id))
+                 (new-subject   (term-index-store subject index expanded-repl)))
+            (apply-unifier complete-subject new-subject complete-unifier rest-of-unif replacements generation-id))
+          (apply-unifier complete-subject subject complete-unifier rest-of-unif replacements generation-id))))))
+
+;
+; Given a set of rules, apply repeatedly to subject until none apply.
+;
+(define reduce
+  (lambda (subject complete-rules rules generation-id)
+    (if (null? rules)
+      subject
+      (let* ((rule-pair     (car rules))
+             (rest-of-rules (cdr rules))
+             (pattern       (car rule-pair))
+             (replacements  (cdr rule-pair))
+             (new-gen-id    (+ generation-id 1))
+             (new-subject   (apply-rule subject pattern replacements generation-id)))
+        (if new-subject
+          (reduce new-subject complete-rules complete-rules new-gen-id)
+          (reduce subject complete-rules rest-of-rules new-gen-id))))))
+
+;
+; Useful shortcut for calling reduce.
+;
+(define toplevel-reduce
+  (lambda (subject complete-rules)
+    (reduce subject complete-rules complete-rules 0)))

File src/syntax.scm

+;
+; Provisional Syntax for Treacle Forms
+; Chris Pressey, April 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+(load "pattern.scm")
+
+;
+; Syntax for atomic terms, including patterns and replacements.
+;
+(define-syntax term-atom
+  (syntax-rules (* ? :i :o @)
+    ((term-atom *)
+      (mk-wildcard))
+    ((term-atom @)
+      (mk-newref))
+    ((term-atom (? name subterm))
+      (mk-named 'name (term-atom subterm)))
+    ((term-atom (:i subterm))
+      (mk-hole 'innermost (term-atom subterm)))
+    ((term-atom (:o subterm))
+      (mk-hole 'outermost (term-atom subterm)))
+    ((term-atom (inner ...))
+      (term-list inner ...))
+    ((term-atom other)
+      'other)))
+
+;
+; Syntax for list terms.
+;
+(define-syntax term-list
+  (syntax-rules ()
+    ((term-list)
+      '())
+    ((term-list atom rest ...)
+      (cons (term-atom atom) (term-list rest ...)))))
+
+;
+; Syntax for replacements.
+;
+(define-syntax replacements
+  (syntax-rules (:)
+    ((replacements)
+      '())
+    ((replacements name : replacement rest ...)
+      (cons (cons 'name (term-atom replacement)) (replacements rest ...)))
+  ))
+
+;
+; Syntax for rules.
+;
+(define-syntax rules
+  (syntax-rules (->)
+    ((rules)
+      '())
+    ((rules pattern -> (repls ...) rest ...)
+      (cons (cons (term-atom pattern) (replacements repls ...)) (rules rest ...)))
+  ))

File src/tests.scm

+;
+; Test suite for Treacle
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+(load "utils.scm")
+(load "pattern.scm");------------------------------------------------------
+
+(test pattern-1
+  (mk-named 'jim (mk-wildcard))
+  #(named jim #(wildcard))
+)
+
+(test pattern-2
+  (is-ground? (mk-named 'jim (mk-wildcard)))
+  #f
+)
+
+(test pattern-3
+  (is-ground? '(cat dog (rabbit) (oyster pigeon)))
+  #t
+)
+
+(test pattern-4
+  (is-replacement? (mk-named 'jim (mk-wildcard)))
+  #f
+)
+
+(test pattern-5
+  (is-replacement? (mk-named 'jim 0))
+  #t
+)
+
+(load "index.scm");--------------------------------------------------------
+
+(test index-fetch-1
+  (term-index-fetch '(1 2 3) '())
+  '(1 2 3)
+)
+
+(test index-fetch-2
+  (term-index-fetch '(1 2 3) '(0))
+  1
+)
+
+(test index-fetch-3
+  (term-index-fetch '(1 2 (1 (1 2 99) (1 2 3))) '(2 1 2))
+  99
+)
+
+(test index-store-1
+  (term-index-store '(1 2 3) '(0) 99)
+  '(99 2 3)
+)
+
+(load "unifier.scm");--------------------------------------------------------
+
+(test bind-name-1
+  (bind-name
+    '(a b c (d e f))
+    '(3)
+    'ralph
+    '()
+  )
+  '((ralph 3))
+)
+
+(test bind-name-2
+  (bind-name
+    '(a b c (d e f))
+    '(3)
+    'ralph
+    '((ralph 0))
+  )
+  #f
+)
+
+(test bind-name-3
+  (bind-name
+    '(a b c a)
+    '(3)
+    'ralph
+    '((ralph 0))
+  )
+  '((ralph 3) (ralph 0))
+)
+
+(test expand-vars-1
+  (expand-vars
+    '(a b c (d e f))
+    '(i j #(named ralph k))
+    '((ralph 3))
+    0
+  )
+  '(i j (d e f))
+)
+
+(test expand-vars-2
+  (expand-vars
+    '(a b c (d e f))
+    '(#(named ralph 0) #(named ed 0))
+    '((ralph 3 1))
+    0
+  )
+  '(e #(named ed 0))
+)
+
+(test expand-vars-3
+  (expand-vars
+    '(a b c (d e f))
+    '(#(newref))
+    '((ralph 3 1))
+    33
+  )
+  '(unique-ref-33)
+)
+
+(load "match.scm");-----------------------------------------------------------
+
+(test match-1
+  (toplevel-match
+    '(a (b c))
+    '(a (b c))
+  )
+  '()
+)
+
+(test match-2
+  (toplevel-match
+    '(a (b c))
+    '(b c)
+  )
+  #f
+)
+
+(test match-3
+  (toplevel-match
+    '(a (b c))
+    '(a #(named ralph #(wildcard)))
+  )
+  '((ralph 1))
+)
+
+(test match-4
+  (toplevel-match
+    '(x right (y 1 2))
+    '#(named t (x #(named i #(wildcard)) #(named j #(wildcard))))
+  )
+  '((t) (j 2) (i 1))
+)
+
+(test match-hole-1
+  (toplevel-match
+    '(a (b b b (c c c (d e)) b b b))
+    '(a #(hole innermost e))
+  )
+  '()
+)
+
+(test match-hole-2
+  (toplevel-match
+    '(a (b b b (c c c (d e)) b b b))
+    '(a #(hole innermost f))
+  )
+  #f
+)
+
+(test match-hole-3
+  (toplevel-match
+    '(a (b b (flag k) (c c c (d (flag a))) b b b))
+    '(a #(hole innermost (flag #(named jim #(wildcard)))))
+  )
+  '((jim 1 2 1))
+)
+
+(test match-hole-4
+  (toplevel-match
+    '(a (b b (flag k) (c c c (d (flag a))) b b b))
+    '(a #(hole innermost #(named jim (flag #(wildcard)))))
+  )
+  '((jim 1 2))
+)
+
+(test match-hole-5
+  (toplevel-match
+    '(a (b b (flag k) (c c c (d (flag a))) b b b))
+    '(a #(named jim #(hole innermost (flag #(wildcard)))))
+  )
+  '((jim 1))
+)
+
+(test match-hole-6
+  (toplevel-match
+    '(pair a (b b (flag k) (c c c (d (flag a))) b b b))
+    '(pair #(named jim #(wildcard)) #(hole innermost (flag #(named bones #(named jim #(wildcard))))))
+  )
+  '((bones 2 3 3 1 1) (jim 2 3 3 1 1) (jim 1))
+)
+
+(test match-order-1
+  (toplevel-match
+    '(thing (flag (world (a b c) (a b (flag k)))) thang)
+    #(hole innermost #(named jim (flag #(wildcard))))
+  )
+  '((jim 1 1 2 2))
+)
+
+(test match-order-2
+  (toplevel-match
+    '(thing (flag (world (a b c) (a b (flag k)))) thang)
+    #(hole outermost #(named jim (flag #(wildcard))))
+  )
+  '((jim 1))
+)
+
+(test match-order-3
+  (toplevel-match
+    '(ast (+ _ (* (lit 2) (lit 3))))
+    '(ast #(hole innermost #(named src (lit #(wildcard)))))
+  )
+  '((src 1 2 1))
+)
+
+(load "reduce.scm");----------------------------------------------------------
+
+(test apply-rule-1
+  (apply-rule
+    '(a b c)
+    #(named jim (a b c))
+    '((jim . k) (bones 1 2 3))
+    0
+  )
+  'k
+)
+
+(test apply-rule-2
+  (apply-rule
+    '(x this (x descends (x to (x the (x right (y 1 2))))))
+    '#(hole innermost #(named t (x #(named i #(wildcard)) #(named j #(wildcard)))))
+    '((t . (xx #(named j 0) #(named i 0))))
+    0
+  )
+  '(x this (x descends (x to (x the (xx (y 1 2) right)))))
+)
+
+(test reduce-1
+  (toplevel-reduce
+    '(a b c)
+    '(
+       ( #(named jim (a b c)) . ((jim . k) (bones 1 2 3)) )
+     )
+  )
+  'k
+)
+
+(test reduce-2
+  (toplevel-reduce
+    '(x this (x descends (x to (x the (x right (y 1 2))))))
+    '(
+      ( ; rule 1
+        #(hole innermost #(named t (x #(named i #(wildcard)) #(named j #(wildcard))))) .
+        ((t . (xx #(named j 0) #(named i 0))))
+      )
+     )
+  )
+  '(xx (xx (xx (xx (xx (y 1 2) right) the) to) descends) this)
+)
+
+(test reduce-3
+  (toplevel-reduce
+    '(x this (x descends (x to (x the (x right (y 1 2))))))
+    '(
+      ( ; rule 1
+        #(hole innermost #(named t (x #(named i #(wildcard)) #(named j #(wildcard))))) .
+        ((t . (xx #(named j 0) #(named i 0))))
+      )
+      ( ; rule 2
+        #(hole innermost #(named p right)) .
+        ((p . left))
+      )
+     )
+  )
+  '(xx (xx (xx (xx (xx (y 1 2) left) the) to) descends) this)
+)
+
+(load "syntax.scm");-----------------------------------------------------------
+
+(test syntax-term-1
+  (term-atom (a b c))
+  '(a b c)
+)
+
+(test syntax-term-2
+  (term-list a b c)
+  '(a b c)
+)
+
+(test syntax-term-3
+  (term-atom (a * c))
+  '(a #(wildcard) c)
+)
+
+(test syntax-term-4
+  (term-atom *)
+  #(wildcard)
+)
+
+(test syntax-term-5
+  (term-atom (a (? bob *) (c d @) f g))
+  '(a #(named bob #(wildcard)) (c d #(newref)) f g)
+)
+
+(test syntax-replacements-1
+  (replacements a : (a b @)  b : (? eb *))
+  '(
+     (a . (a b #(newref)))
+     (b . #(named eb #(wildcard)))
+   )
+)
+
+(test syntax-rules-1
+  (rules
+    (:i (? t (x (? i *) (? j *)))) -> ( t : (xx (? j 0) (? i 0))    )
+    (:i (? p right))               -> ( p : left )
+  )
+  '(
+    (
+      #(hole innermost #(named t (x #(named i #(wildcard)) #(named j #(wildcard))))) .
+      ((t . (xx #(named j 0) #(named i 0))))
+    )
+    (
+      #(hole innermost #(named p right)) .
+      ((p . left))
+    )
+  )
+)
+
+;-------------------------------------------------------------------
+; Forest-rewriting, a la Arboretuum.
+;-------------------------------------------------------------------
+
+(test rewrite-forest-1
+  (toplevel-reduce
+    '(forest (ast (+ (lit 3) (* (lit 2) (lit 3))))
+             (out halt))
+    '(
+       ( ; rule 1
+         (forest (ast #(hole innermost #(named src (lit #(named val #(wildcard))))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (push #(named val) halt)))
+       )
+       ( ; rule 2
+         (forest (ast #(hole innermost #(named src (+ _ _))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (add halt)))
+       )
+       ( ; rule 3
+         (forest (ast #(hole innermost #(named src (* _ _))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (mul halt)))
+       )
+     )
+  )
+  '(forest (ast _) (out (push 3 (push 2 (push 3 (mul (add halt)))))))
+)
+
+(test rewrite-forest-2
+  (toplevel-reduce
+    '(forest (stab (a 4 eot))
+             (ast  (+ 1 2 3 a 5 6 a 7 8 9)))
+    '(
+       ( ; rule 1
+         (forest (stab #(hole innermost (#(named n #(wildcard)) #(named v #(wildcard)) #(named tab #(wildcard)))))
+                 (ast  #(hole innermost #(named dest #(named n #(wildcard)))))) .
+         ((dest . #(named v)))
+       )
+     )
+  )
+  '(forest (stab (a 4 eot)) (ast (+ 1 2 3 4 5 6 4 7 8 9)))
+)
+
+(test rewrite-forest-3
+  (toplevel-reduce
+    '(forest (ast (let a (lit 4) (+ (lit 3) (* (var a) (lit 3)))) )
+             (stab eot)
+             (out halt))
+    '(
+       ( ; rule 1
+         (forest (ast  #(hole innermost #(named src
+                          (let #(named n #(wildcard)) #(named v #(wildcard)) #(named expr #(wildcard)))  )))
+                 (stab #(hole innermost #(named dest  eot)))
+                 (out  #(wildcard))) .
+         ((src . #(named expr 0)) (dest . (#(named n 0) #(named v 0) eot)))
+       )
+       ( ; rule 2
+         (forest (ast  #(hole innermost #(named src (var #(named n #(wildcard))))))
+                 (stab #(hole innermost (#(named n #(wildcard)) #(named v #(wildcard)) #(wildcard))))
+                 (out  #(wildcard))) .
+         ((src . #(named v 0)))
+       )
+       ( ; rule 3
+         (forest (ast  #(hole innermost #(named src (lit #(named val #(wildcard))))))
+                 (stab #(wildcard))
+                 (out  #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (push #(named val) halt)))
+       )
+       ( ; rule 4
+         (forest (ast #(hole innermost #(named src (+ _ _))))
+                 (stab #(wildcard))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (add halt)))
+       )
+       ( ; rule 5
+         (forest (ast #(hole innermost #(named src (* _ _))))
+                 (stab #(wildcard))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (mul halt)))
+       )
+     )
+  )
+  '(forest (ast _)
+           (stab (a (lit 4) eot))
+           (out (push 3 (push 4 (push 3 (mul (add halt)))))))
+)
+
+;
+; This test is close to (although not exactly) what we'd like to see, for
+; translating "if" statements to machine code.  It uses newref to generate
+; labels for the jumps.  It rewrites the AST several times to ensure that
+; the jumps and labels are generated in the right order.
+;
+(test rewrite-forest-4
+  (toplevel-reduce
+    '(forest (ast (if (> (lit 6) (lit 4)) (print (lit 1)) (print (lit 2))) )
+             (out halt))
+    '(
+       ( ; rule -- get label for if
+         (forest (ast #(hole innermost #(named src (if _ #(named then #(wildcard)) #(named else #(wildcard)) ))))
+                 (out #(wildcard))) .
+         ((src . (iflab #(named then 0) #(named else 0) #(newref))))
+       )
+       ( ; rule -- reduce if to then
+         (forest (ast #(hole innermost #(named src
+                         (iflab #(named then #(wildcard)) #(named else #(wildcard)) #(named elselab #(wildcard)))
+                 )))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . (then #(named then 0) #(named else 0) #(named elselab 0)))
+          (dest . (jmp-if-false #(named elselab 0) halt)))
+       )
+       ( ; rule -- reduce then to else
+         (forest (ast #(hole innermost #(named src (then _ #(named else #(wildcard)) #(named elselab #(wildcard))))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . #(named else 0)) (dest . (label #(named elselab 0) halt)))
+       )
+       ( ; rule -- translate operator
+         (forest (ast #(hole innermost #(named src (> _ _))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (gt halt)))
+       )
+       ( ; rule -- translate command
+         (forest (ast #(hole innermost #(named src (print _))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (print halt)))
+       )
+       ( ; rule -- translate literal
+         (forest (ast #(hole innermost #(named src (lit #(named val #(wildcard))))))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (push #(named val) halt)))
+       )
+     )
+  )
+  '(forest (ast _)
+           (out (push 6 (push 4 (gt (jmp-if-false unique-ref-16
+                                     (push 1 (print (label unique-ref-16 (push 2 (print halt)))))))))))
+)
+
+;
+; This test is pretty much exactly what we'd like to see for translation of
+; "if" statements to machine code.  It relies on the fact that all newrefs
+; in a replacement generate the same new reference.  It also uses an auxilliary
+; tree, the bpt (branch point table) instead of rewriting the main AST to
+; clarify somewhat the dependencies.
+;
+(test rewrite-forest-5
+  (toplevel-reduce
+    '(forest (ast (if (> (lit 6) (lit 4)) (print (lit 1)) (print (lit 2))) )
+             (bpt eot)
+             (out halt))
+    '(
+       ( ; rule -- get label for if
+         (forest (ast #(hole innermost #(named src (if _ #(named then #(wildcard)) #(named else #(wildcard)) ))))
+                 (bpt #(hole innermost #(named branch eot)))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((branch . (then #(newref))) (dest . (jmp-if-false #(newref) halt)))
+       )
+       ( ; rule -- get label for if
+         (forest (ast #(hole innermost #(named src (if _ _ #(named else #(wildcard)) ))))
+                 (bpt #(hole innermost #(named branch (then #(named ref #(wildcard))) eot)))  ; XXX???
+                 (out #(hole innermost #(named dest halt)))) .
+         ((branch . (else #(newref))) (dest . (goto #(newref) (label #(named ref 0) halt))))
+       )
+       ( ; rule -- get label for if
+         (forest (ast #(hole innermost #(named src (if _ _ _) )))
+                 (bpt #(hole innermost #(named branch (else #(named ref #(wildcard))) eot)))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (branch . eot) (dest . (label #(named ref 0) halt)))
+       )
+       ( ; rule -- translate operator
+         (forest (ast #(hole innermost #(named src (> _ _))))
+                 (bpt #(wildcard))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (gt halt)))
+       )
+       ( ; rule -- translate command
+         (forest (ast #(hole innermost #(named src (print _))))
+                 (bpt #(wildcard))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (print halt)))
+       )
+       ( ; rule -- translate literal
+         (forest (ast #(hole innermost #(named src (lit #(named val #(wildcard))))))
+                 (bpt #(wildcard))
+                 (out #(hole innermost #(named dest halt)))) .
+         ((src . _) (dest . (push #(named val 0) halt)))
+       )
+     ))
+  '(forest (ast _)
+           (bpt eot)
+           (out (push 6 (push 4 (gt (jmp-if-false unique-ref-16
+                  (push 1 (print (goto unique-ref-29
+                  (label unique-ref-16 (push 2 (print
+                  (label unique-ref-29 halt)))))))))))))
+)
+
+; Treacle syntax for previous test.
+
+(test rewrite-forest-6
+  (toplevel-reduce
+    '(forest (ast (if (> (lit 6) (lit 4)) (print (lit 1)) (print (lit 2))) )
+             (bpt eot)
+             (out halt))
+    (rules
+      (forest (ast (:i (? src (if _ (? then *) (? else *)))))
+              (bpt (:i (? branch eot)))
+              (out (:i (? dest halt))))
+      -> ( branch : (then @)  dest : (jmp-if-false @ halt) )
+
+      (forest (ast (:i (? src (if _ _ (? else *)))))
+              (bpt (:i (? branch (then (? ref *)))))
+              (out (:i (? dest halt))))
+      -> ( branch : (else @)  dest : (goto @ (label (? ref *) halt)) )
+
+      (forest (ast (:i (? src (if _ _ _))))
+              (bpt (:i (? branch (else (? ref *)))))
+              (out (:i (? dest halt))))
+      -> ( src : _  branch : eot  dest : (label (? ref *) halt) )
+
+      (forest (ast (:i (? src (> _ _ ))))
+              (bpt *)
+              (out (:i (? dest halt))))
+      -> ( src : _  dest : (gt halt) )
+
+      (forest (ast (:i (? src (print _))))
+              (bpt *)
+              (out (:i (? dest halt))))
+      -> ( src : _  dest : (print halt) )
+
+      (forest (ast (:i (? src (lit (? val *)))))
+              (bpt *)
+              (out (:i (? dest halt))))
+      -> ( src : _  dest : (push (? val *) halt) )
+
+    ))
+  '(forest (ast _)
+           (bpt eot)
+           (out (push 6 (push 4 (gt (jmp-if-false unique-ref-16
+                  (push 1 (print (goto unique-ref-29
+                  (label unique-ref-16 (push 2 (print
+                  (label unique-ref-29 halt)))))))))))))
+)

File src/unifier.scm

+;
+; Support for matching of patterns containing contexts (holes)
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+;
+; A traditional unifier is a set of (variable name, value) pairs indicating
+; what value is bound to each variable name.  In our case, unifiers contain
+; (name, term index) pairs which bind named to indices into the subject term.
+; In a sense, conventional unifiers are unifiers "by value" while Treacle's
+; are unifiers "by reference".
+;
+; Note also that in these unifiers, the same name can be bound to *multiple*
+; positions within the subject, since the name may occur in any number of
+; positions in the pattern, and will match as long as those subterms are
+; equivalent.
+;
+
+;
+; Create and return a new, empty unifier.
+;
+(define mk-empty-unifier
+  (lambda ()
+    '()))
+
+;
+; Extend the given unifier to one where the given name is associated with
+; the given term index in the given subject term.  If such an extension is
+; not possible (i.e. the name is already bound to an inequivalent term at
+; a different index in the subject,) then #f is returned.
+;
+(define bind-name
+  (lambda (subject index name unifier)
+    (if (scour-unifier? subject index name unifier)
+      (cons (cons name index) unifier)
+      #f)))
+
+;
+; Helper function for bind-name.  Returns #t if it's OK to extend the
+; unifier with the given name->index association, #f otherwise
+;
+(define scour-unifier?
+  (lambda (subject index name unifier)
+    (cond
+      ((null? unifier)
+        #t)
+      (else
+        (let* ((pair        (car unifier))
+               (bound-name  (car pair))
+               (bound-index (cdr pair)))
+          (cond
+            ((not (eq? name bound-name))
+              (scour-unifier? subject index name (cdr unifier)))
+            ((eqv? index bound-index) ; already bound to same place: ok
+              (scour-unifier? subject index name (cdr unifier)))
+            ((eqv? (term-index-fetch subject index) ; already bound to equiv
+                   (term-index-fetch subject (cdr pair))) ; term: alright
+              (scour-unifier? subject index name (cdr unifier)))
+            (else                ; already bound to something else: not good
+              #f)))))))
+
+;
+; Given a subject, a replacement, and a unifier, return a term which is like
+; the replacement except where where each of the placeholders in the replacement
+; has been replaced by the associated term referenced in the unifier.
+;
+(define expand-vars
+  (lambda (subject replacement unifier generation-id)
+    (cond ((is-named? replacement)      ; variable - replace if in unifier
+            (let* ((pair (assq (get-name replacement) unifier)))
+              (cond ((pair? pair)
+                      (term-index-fetch subject (cdr pair)))
+                    (else
+                      replacement))))
+          ((is-newref? replacement)
+            (string->symbol (string-append "unique-ref-" (number->string generation-id))))
+          ((list? replacement)          ; list - recurse
+            (map (lambda (subpattern)
+                   (expand-vars subject subpattern unifier generation-id))
+                 replacement))
+          (else                         ; ground term - leave it alone.
+            replacement))))

File src/utils.scm

+;
+; Utility functions used by Treacle
+; Chris Pressey, March 2008
+;
+
+; Copyright (c)2008 Cat's Eye Technologies.  All rights reserved.
+;
+; Redistribution and use in source and binary forms, with or without
+; modification, are permitted provided that the following conditions
+; are met:
+;
+; 1. Redistributions of source code must retain the above copyright
+;    notices, this list of conditions and the following disclaimer.
+; 2. Redistributions in binary form must reproduce the above copyright
+;    notices, this list of conditions, and the following disclaimer in
+;    the documentation and/or other materials provided with the
+;    distribution.
+; 3. Neither the names of the copyright holders nor the names of their
+;    contributors may be used to endorse or promote products derived
+;    from this software without specific prior written permission. 
+;
+; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+; ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES INCLUDING, BUT NOT
+; LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+; FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+; COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+; BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+; CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+; LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+; ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+; POSSIBILITY OF SUCH DAMAGE.
+
+;
+; Debugging output.
+;
+(define-syntax print
+  (syntax-rules ()
+    ((print e)
+      (display e))
+    ((print e1 e2 ...)
+      (begin (display e1)
+             (print e2 ...)))))
+
+(define-syntax println
+  (syntax-rules ()
+    ((println e)
+      (begin (display e)
+             (newline)))
+    ((println e1 e2 ...)
+      (begin (display e1)
+             (println e2 ...)))))
+
+;
+; Testing framework.
+;
+(define-syntax test
+  (syntax-rules ()
+    ((test test-name expr expected)
+      (begin
+        (print "Running test: " (quote test-name) "... ")
+        (let ((result expr))
+          (cond
+            ((equal? result expected)
+              (println "passed."))
+            (else
+              (println "FAILED!")
+              (println "Expected: " expected)
+              (println "Actual:   " result))))))))