Commits

Anonymous committed 16d7296

Convert documentation to Markdown format.

Comments (0)

Files changed (2)

+SMITH
+=====
+
+**S**elf-**M**odifying **I**ndecent **T**uring **H**ack
+
+Version 2.1-2011.0922. ©2000-2011 Cat's Eye Technologies. All rights
+reserved.
+
+* * * * *
+
+> "This space intentionally left blank."
+>
+> — MIT's *Dungeon* (later Infocom's *Zork I*)
+
+> "The only thing that really worried me was the ether. There is nothing
+> in the world more helpless and irresponsible and depraved than a man
+> in the depths of an ether binge. And I knew we'd get into that rotten
+> stuff pretty soon."
+>
+> — Hunter S. Thompson, *Fear and Loathing in Las Vegas*
+
+### What is SMITH?
+
+SMITH, the successor to [SMETANA](/projects/smetana/), is a programming
+language that goes one better than
+[Bullfrog](http://members.tripod.com/rkusnery/bullfrog.html). Not only
+are there no conditional jumps, there are no jumps **whatsoever**. The
+program counter can *only* be incremented and *only* instruction by
+instruction.
+
+### What the...???
+
+*But*, you ask, *how can such an ungodly beast be Turing-Complete?
+Surely this goes against all that is decent and good and sane in the
+world we know.* (OK, maybe you won't put it exactly that way - I forgive
+you.)
+
+Well, there's actually a relatively straightforward answer. Instead of
+jumping back into code to repeat and/or recheck something, you must
+instead copy blocks of code forward, where they will be checked in the
+near future.
+
+And if the code copied forward includes the instruction that *caused*
+the copy in the first place, that will happen again (and again and
+again) when the copy executes. A loop, but not by means of iteration or
+recursion but instead by *self-propogation*.
+
+Otherwise, SMITH is basically a small generic imperative CISC-ish
+virtual machine with only the most basic arithmetic operations, loosely
+modelled on the "[imaginary] register machine that is representative of
+several minicomputers" described in section 9.2 of the Dragon Book.
+
+And when I say loosely modelled, I mean it. Differences between this and
+the 'Dragon' machine include the fact that this has no `ADD`
+instruction, and that here, the destination is specified *before* the
+source (not after;) also, the `#` before an immediate (literal) is
+allowed but not required here, and the indirection syntax is completely
+different (there's only indirect references to registers, not memory,
+anyway.)
+
+### Overview
+
+Consider the SMITH machine to have an unlimited number of registers.
+These are called `R0` to `Rn` where no arbitrary limit is imposed on
+`n`. In some instructions, they may be referenced indirectly, and in
+that case they are called `R[R0]` to `R[Rn]`.
+
+An indirect reference like `R[R0]` refers to 'the register whose number
+is the number stored in register zero.' If `R0` contained 7, then
+`R[R0]` would be synonymous with `R7`.
+
+Each instruction has the general form:
+
+      OPCODE [destination[, source[, length]]]
+
+(Here, the square breackets denote "may be omitted", not indirect
+references.)
+
+`destination` is either the name of a register or an immediate offset
+from the current program counter. `source` is either a register, an
+immediate offset from the current program counter, or an immediate
+integer. `length`, if present, is the name of a register. All integers
+are notated in decimal. `#` may precede immediate integers if the
+programmer desires. However, it may not precede offsets (they should be
+preceded by either `+` or `-`.)
+
+### Instructions
+
+The basic instructions (available in the original version) are:
+
+      MOV register, immediate       e.g.  MOV R0, 0
+      MOV register, register              MOV R1, R0
+
+      SUB register, immediate             SUB R1, 1
+      SUB register, register              SUB R0, R1
+
+      MUL register, immediate             MUL R0, 2
+
+      NOT register                        NOT R0
+
+      COR offset, offset, register        COR +1, -5, R0
+
+      STOP                                STOP
+
+The instructions in the following table were made available in SMITH v2
+(June 25 2000).
+
+      MOV [register], register            MOV R[R1], R0
+      MOV register, [register]            MOV R1, R[R0]
+
+      MOV [register], "string"            MOV R[R0], "nights and weird mornings"
+
+      MOV register, PC                    MOV R303, PC
+      MOV register, *                     MOV R304, *
+
+      MOV TTY, register                   MOV TTY, R1
+      MOV TTY, [register]                 MOV TTY, R[R65535]
+      MOV register, TTY                   MOV R0, TTY
+      MOV [register], TTY                 MOV R[R0], TTY
+
+      MUL register, register              MUL R999999999, R123456789
+
+      COR offset, register, register      COR +1, R22, R23
+
+      BLA offset, NOP, register           BLA +2, NOP, R10
+
+      NOP                                 NOP
+
+SMITH v2 also added two directives:
+
+      ; arbitrary text composing a source comment               ; Kilroy was here
+      REP int OPCODE [destination[, source[, length]]]          REP 50 STOP
+
+### Explanation
+
+`MOV` will assign a register a value without modifying it.
+
+The `TTY` forms of `MOV` are just a lazy way to refer to some
+(potentially complex) system routine for the 'usual' input and output
+character streams, so the program can communicate (in some primitive
+fashion) with the user.
+
+The `PC` and `*` forms of `MOV` facilitate calculating absolute
+addresses of subroutines. Actually, `*` is more like a macro which
+expands to the current line number in the source file.
+
+The `"string"` form of `MOV` moves each character of the immediate
+string into a successive register. If `R0` contained 7, then
+`MOV R[R0], "cat"` would result in `R7 == 'c', R8 == 'a', R9 == 't'`.
+
+`SUB` and `MUL` will subtract or multiply a register by a value,
+respectively. `NOT` will boolean-negate a register (false = 0/zero, true
+= 1/nonzero.)
+
+`COR` is the truly interesting opcode; it stands for `CO`py by
+`R`egister. The value of the length operand (which is always in a
+register) is a count of instructions. This many instructions is copied
+from the source offset (which may be immediate or in a register) to the
+destination offset (which is always immediate), modifying the program
+(thus, self-modifying code.)
+
+> Note: prior to version 2007.0722 of the SMITH language, the result of
+> actually **overwriting** existing instructions with other instructions
+> was not defined. The original idea was that SMITH programs would
+> merely **extend** themselves by copying bits of themselves to just
+> past the end of the program, like Sylvester chasing Tweety around the
+> Christmas tree on a toy train -- and I still think that's all that's
+> needed for SMITH to be Turing-complete. However, since the language
+> does reasonably claim to be fully "self-modifying" and not just
+> "self-extending" (SEITH?!?), it's official: as of version 2007.0722,
+> you can overwrite instructions with other instructions using `COR`.
+> Even when the source and destination ranges of such a copy overlap,
+> the resulting series of instructions at the destination should be an
+> exact copy of the series at the source.
+>
+> > Note note: It seems some of you are having trouble understanding the
+> > version system being used for SMITH, which is eminently
+> > understandable given that it is completely inconsistent. The first
+> > version had no version. The second version, released a scant few
+> > days later, containing many new instructions, was "v2". That version
+> > was re-christened "version 2.0L" when we added the BSD license to it
+> > (because "L" is for "license", y'see?) The third version, released a
+> > scant few years later, was "version 2007.0722". This was back when
+> > we thought date-based version numbers could substitute entirely for
+> > sequence-based version numbers. This version was also known as
+> > "version 2.1 revision 2007.0722" when we dropped that idea. The
+> > current version, which differs mainly in that its documentation
+> > contains this paragraph, is version **2.1-2011.0922**. Your
+> > knowledge is now complete. Go home.
+
+`BLA` is kind of a special version of `COR` that makes some programming
+a lot easier. It stands for `BLA`nk and as part of it's very special
+format it takes an *immediate opcode* as its second argument. It copies
+that opcode (which must be either `NOP` or `STOP`) to the offset, and
+uses the number found in the length register as a multiplier.
+
+Execution starts at the beginning of the program (well, naturally,) and
+stops when the program counter tries to execute beyond the last
+instruction specified in the program. A `STOP` instruction will also
+serve this purpose, if you choose to use it.
+
+### Huh.
+
+So there you go. Myth debunked. You don't need `if` or `goto` or `while`
+or anything so fancy to be Turing-Complete. All you need is to be able
+to `memcpy` yourself! Don't that just take the cake?

doc/smith.html

-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<!-- encoding: UTF-8 -->
-<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
-<head>
-  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-  <meta name="Description" content="Cat's Eye Technologies: The SMITH Programming Language" />
-  <meta name="Keywords" content=" Cat's Eye Technologies Language
- Programming computational computation self-modification self-modifying
- code self-copying self-propogation self-perpetuation infinite tape
- half-infinite Turing tape Turing-Complete experimental indecent hack " />
-  <title>The SMITH Programming Language</title>
-  <!-- begin html doc dynamic markup -->
-  <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
-  <script type="text/javascript" src="/scripts/documentation.js"></script>
-  <!-- end html doc dynamic markup -->
-</head>
-<body>
-
-<h1>SMITH</h1>
-
-<p><b>S</b>elf-<b>M</b>odifying <b>I</b>ndecent <b>T</b>uring <b>H</b>ack</p>
-
-<p>Version 2.1-2011.0922.  ©2000-2011 Cat's Eye Technologies.  All rights reserved.</p>
-    
-<hr />
-
-<blockquote>
-<p style="font-style: italic; margin-bottom: 0">"This space intentionally left blank."</p>
-<p style="font-style: normal; margin-left: 2em; margin-top: 0.1em">— MIT's <i>Dungeon</i> (later Infocom's <i>Zork I</i>)</p>
-</blockquote>
-
-<blockquote>
-<p style="font-style: italic; margin-bottom: 0">"The only thing that really worried me was the ether.  There is
- nothing in the world more helpless and irresponsible and
- depraved than a man in the depths of an ether binge.  And I knew
- we'd get into that rotten stuff pretty soon."</p>
-<p style="font-style: normal; margin-left: 2em; margin-top: 0.1em">— Hunter S. Thompson, <i>Fear and Loathing in Las Vegas</i></p>
-</blockquote>
-
-<h3>What is SMITH?</h3>
-
-<p>SMITH, the successor to <a href="/projects/smetana/">SMETANA</a>,
-is a programming language that goes one better than <a class="external"
- href="http://members.tripod.com/rkusnery/bullfrog.html">Bullfrog</a>.
-
-Not only are there no conditional jumps,
-there are no jumps <strong>whatsoever</strong>.  The program counter
-can <i>only</i> be incremented and <i>only</i> instruction by
-instruction.</p>
-
-<h3>What the...???</h3>
-
-<p><i>But</i>, you ask, <i>how can such an ungodly beast be Turing-Complete?
-Surely this goes against all that is decent and good and sane in the world we know.</i>
-(OK, maybe you won't put it exactly that way - I forgive you.)</p>
-
-<p>Well, there's actually a relatively straightforward answer.
-Instead of jumping back into code to repeat and/or recheck
-something, you must instead copy blocks of code forward, where
-they will be checked in the near future.</p>
-
-<p>And if the code copied forward includes the instruction
-that <i>caused</i> the copy in the first place, that will happen
-again (and again and again)
-when the copy executes.  A loop, but not by means of iteration or
-recursion but instead by <i>self-propogation</i>.</p>
-
-<p>Otherwise, SMITH is basically a small generic imperative 
-CISC-ish
-virtual machine with only the most basic arithmetic operations,
-loosely modelled on the "[imaginary] register machine that is
-representative of several minicomputers" described in section 9.2
-of the Dragon Book.</p>
-
-<p>And when I say loosely modelled, I mean it.  Differences between
-this and the 'Dragon' machine include the fact that this has no <code>ADD</code>
-instruction, and that here, the destination is specified <i>before</i>
-the source (not after;) also, the <code>#</code> before an immediate
-(literal) is allowed but not required here, and the indirection
-syntax is completely different (there's only indirect references to
-registers, not memory, anyway.)</p>
-
-<h3>Overview</h3>
-
-<p>Consider the SMITH machine to have an unlimited number of
-registers.  These are called <code>R0</code> to <code>R<i>n</i></code> where
-no arbitrary limit is imposed on <code><i>n</i></code>.  In some instructions,
-they may be referenced indirectly, and in that case they are called
-<code>R[R0]</code> to <code>R[R<i>n</i>]</code>.</p>
-
-<p>An indirect reference like <code>R[R0]</code> refers to 'the register
-whose number is the number stored in register zero.'  If <code>R0</code>
-contained 7, then <code>R[R0]</code> would be synonymous with <code>R7</code>.</p>
-
-<p>Each instruction has the general form:</p>
-
-<pre>  OPCODE [destination[, source[, length]]]</pre>
-
-<p>(Here, the square breackets denote "may be omitted", not indirect references.)</p>
-
-<p><code>destination</code> is either the name of a register or an immediate
-offset from the current program counter.  <code>source</code> is either a
-register, an immediate offset from the current program counter, or an
-immediate integer.  <code>length</code>, if present, is the name of a
-register.  All integers are notated in decimal.  <code>#</code> may precede
-immediate integers if the programmer desires.  However, it may not
-precede offsets (they should be preceded by either <code>+</code> or <code>-</code>.)</p>
-
-<h3>Instructions</h3>
-
-<p>The basic instructions (available in the original version) are:</p>
-
-<pre>  MOV register, immediate       e.g.  MOV R0, 0
-  MOV register, register              MOV R1, R0
-
-  SUB register, immediate             SUB R1, 1
-  SUB register, register              SUB R0, R1
-
-  MUL register, immediate             MUL R0, 2
-
-  NOT register                        NOT R0
-
-  COR offset, offset, register        COR +1, -5, R0
-
-  STOP                                STOP
-</pre>
-
-<p>The instructions in the following table
-were made available in SMITH v2 (June 25 2000).</p>
-
-<pre>  MOV [register], register            MOV R[R1], R0
-  MOV register, [register]            MOV R1, R[R0]
-
-  MOV [register], "string"            MOV R[R0], "nights and weird mornings"
-
-  MOV register, PC                    MOV R303, PC
-  MOV register, *                     MOV R304, *
-
-  MOV TTY, register                   MOV TTY, R1
-  MOV TTY, [register]                 MOV TTY, R[R65535]
-  MOV register, TTY                   MOV R0, TTY
-  MOV [register], TTY                 MOV R[R0], TTY
-
-  MUL register, register              MUL R999999999, R123456789
-
-  COR offset, register, register      COR +1, R22, R23
-
-  BLA offset, NOP, register           BLA +2, NOP, R10
-
-  NOP                                 NOP
-</pre>
-
-<p>SMITH v2 also added two directives:</p>
-
-<pre>  ; arbitrary text composing a source comment               ; Kilroy was here
-  REP int OPCODE [destination[, source[, length]]]          REP 50 STOP
-</pre>
-
-<h3>Explanation</h3>
-
-<p><code>MOV</code> will assign a register a value without modifying it.</p>
-
-<p>The <code>TTY</code> forms of <code>MOV</code> are just a lazy way to refer to some
-(potentially complex) system routine for the 'usual' input and output character streams,
-so the program can communicate (in some primitive fashion) with the user.</p>
-
-<p>The <code>PC</code> and <code>*</code> forms of <code>MOV</code> facilitate calculating
-absolute addresses of subroutines.  Actually, <code>*</code> is more like a macro
-which expands to the current line number in the source file.</p>
-
-<p>The <code>"string"</code> form of <code>MOV</code> moves each character of the immediate
-string into a successive register.  If <code>R0</code> contained 7, then <code>MOV R[R0], "cat"</code>
-would result in <code>R7 == 'c', R8 == 'a', R9 == 't'</code>.</p>
-
-<p><code>SUB</code> and
-<code>MUL</code> will subtract or multiply a register by a value, respectively.
-<code>NOT</code> will boolean-negate a register (false = 0/zero, true = 1/nonzero.)</p>
-
-<p><code>COR</code> is the truly interesting opcode; it stands for
-<code>CO</code>py by <code>R</code>egister.
-The value of the length operand (which is always in a register) is a count of instructions.
-This many instructions is copied from the source offset (which may be immediate or in a register)
-to the destination offset (which is always immediate),
-modifying the program (thus, self-modifying code.)</p>
-
-<blockquote style="font-style: italic; margin-left: 2em;"><p>Note: prior to version 2007.0722 of the SMITH language, the result of actually
-<b>overwriting</b> existing instructions with other instructions was not defined.
-The original idea was that SMITH programs would merely <b>extend</b> themselves
-by copying bits of themselves to just past the end of the program, like Sylvester
-chasing Tweety around the Christmas tree on a toy train --
-and I still think that's all that's needed for SMITH to be Turing-complete.
-However, since the language does reasonably claim to be fully "self-modifying" and not just
-"self-extending" (SEITH?!?), it's official: as of version 2007.0722, you can overwrite
-instructions with other instructions using <code>COR</code>.  Even when the source and
-destination ranges of such a copy overlap, the resulting series of instructions at the
-destination should be an exact copy of the series at the source.</p>
-
-  <blockquote>
-    <p>Note note: It seems some of you are having trouble understanding the version
-    system being used for SMITH, which is eminently understandable given that it is
-    completely inconsistent.  The first version had no version.  The second version, released
-    a scant few days later, containing many new instructions, was "v2".  That version was re-christened "version 2.0L" when we
-    added the BSD license to it (because "L" is for "license", y'see?)  The third version, released a scant few years later, was
-    "version 2007.0722".  This was back when we thought date-based version numbers could
-    substitute entirely for sequence-based version numbers.  This version was also known
-    as "version 2.1 revision 2007.0722" when we dropped that idea.  The current version,
-    which differs mainly in that its documentation contains this paragraph, is
-    version <b>2.1-2011.0922</b>.  Your knowledge is now complete.  Go home.</p>
-  </blockquote>
-</blockquote>
-
-<p><code>BLA</code> is kind of a special version of <code>COR</code> that makes some
-programming a lot easier.  It stands for <code>BLA</code>nk and as part of it's very
-special format it takes an <i>immediate opcode</i> as its second argument.
-It copies that
-opcode (which must be either <code>NOP</code> or <code>STOP</code>) to the offset, and
-uses the number found in the length register as a multiplier.</p>
-
-<p>Execution starts at the beginning of the program (well, naturally,)
-and stops when the program counter tries to execute beyond the last
-instruction specified in the program.  A <code>STOP</code> instruction
-will also serve this purpose, if you choose to use it.</p>
-
-<h3>Huh.</h3>
-
-<p>So there you go.  Myth debunked.  You don't need <code>if</code>
-or <code>goto</code> or <code>while</code> or anything so fancy to be
-Turing-Complete.  All you need is to be able to
-<code>memcpy</code> yourself!  Don't that just take the cake?</p>
-
-</body></html>