Commits

Anonymous committed f40dd60 Draft

Initial import of Shelta version 1.0 revision 1999.1223 sources.

Files changed (21)

+@echo off
+REM BOOTSTRP.BAT v2002.1208 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+REM Builds the bootstrapped versions (S & S2) of the Shelta compiler.
+@echo on
+call bin\shelta 86 prj\sheltas
+copy prj\sheltas.com bin\sheltas.com
+call bin\shelta s prj\sheltas
+copy prj\sheltas.com bin\sheltas2.com
+call bin\shelta s2 prj\sheltas
+diff prj\sheltas.com bin\sheltas2.com
+del prj\sheltas.com

Binary file added.

+@echo off
+REM SHELTA.BAT v2002.1208 (c)2002 Cat's-Eye Technologies.
+REM A 'make'-like utility for Shelta compilers, as an MS-DOS batch.
+
+REM -- Change the following lines to tailor what libraries are
+REM -- included by default.  See readme.txt
+type lib\8086\8086.she >s
+type lib\8086\gupi.she >>s
+type lib\8086\dos.she >>s
+type lib\8086\string.she >>s
+type lib\gupi\linklist.she >>s
+
+REM -- This section builds the source file, always called 'S'.
+if not exist %2.she echo Can't find project file %2.she!
+if exist %3.she type %3.she >>s
+if exist %4.she type %4.she >>s
+if exist %5.she type %5.she >>s
+if exist %6.she type %6.she >>s
+if exist %7.she type %7.she >>s
+if exist %8.she type %8.she >>s
+if exist %9.she type %9.she >>s
+if exist %2.she type %2.she >>s
+type null.txt >>s
+
+bin\shelta%1.com <s > %2.com
+
+if errorlevel 32 echo Source file could not be opened.
+if errorlevel 16 if not errorlevel 32 echo Error - Unknown identifier in source file.
+del s

Binary file added.

Binary file added.

Binary file added.

+Making the Snake Eat its Tail: Bootstrapping
+--------------------------------------------
+Oct 20 1999, Chris Pressey, Cat's Eye Technologies.
+
+What is bootstrapping?
+----------------------
+
+Bootstrapping is the act of implementing a compiler for a language in
+that same language, or a subset of it.  It is a well-understood aspect
+of compilation and the translation of compilers from one machine onto
+a different machine.
+
+Bootstrapping is a fairly esoteric discipline, however, partly because
+there's little need to do it more than once for any given compiler and
+any given machine, but also because at a basic level, bootstrapping is
+somewhat difficult to understand.
+
+How, for example, do you write a compiler that compiles itself, without
+first having the compiler???  Sounds more than a little bit like the
+"Which came first, the chicken or the egg" paradox.
+
+The term 'bootstrap' itself comes from the whimsical idea that if you
+were to bend over and tug at the straps on your own boots, you could
+lift yourself off the ground.
+
+So what's the trick to making a compiler levitate?
+--------------------------------------------------
+
+Well, first of all let me put your mind at rest - there's no paradox
+or magic or anything else spooky involved, although it can feel that
+way sometimes.  There are in fact two realistic options for
+bootstrapping:
+
+- Write (on paper) the compiler in the language which it compiles,
+  then hand-translate (i.e. manually compile) it to assembly or
+  machine language.  This approach has been the one taken for the
+  first compilers for both Pascal and LISP.
+
+- First write a compiler for the language in another, already-available
+  language, such as assembly language, or C.  Then re-write that compiler
+  in the language which it compiles.  This is the approach many
+  bootstraps have taken.
+
+How was Shelta bootstrapped?
+----------------------------
+
+I took the second approach.
+
+First I wrote the Shelta compiler SHELTA86.COM in assembly language
+(SHELTA86.ASM) using Turbo Assembler 3.1.
+
+             +----------------+
+SHELTA86.ASM | TASM ---> 8086 | SHELTA86.COM
+             +----+      +----+
+                  | 8086 |
+                  +------+
+                  TASM.EXE
+
+This is a 'tee diagram' as is commonly used by people who have to
+do these sorts of things... it's pretty simple to understand.
+
+The filename on the left is the input into the 'tee', the filename on
+the right is the output of the 'tee'.  The filename on the bottom is
+the tool which is translating the input into the output.  The formats
+listed inside the tee are the languages each of the files is written in.
+
+Because the output of this tee is a compiler, however, it can exist as
+a tee in its own right:
+
+                        +-----------------+
+                        | Shelta --> 8086 |
+             +----------+-----+      +----+
+SHELTA86.ASM | TASM ---> 8086 | 8086 |
+             +----+      +----+------+
+                  | 8086 |  SHELTA86.COM
+                  +------+
+                  TASM.EXE
+
+So I re-wrote SHELTA86.ASM in Shelta/GUPI, calling it SHELTAS.SHE.
+
+                        +-----------------+
+            SHELTAS.SHE | Shelta --> 8086 | SHELTAS.COM
+             +----------+-----+      +----+
+SHELTA86.ASM | TASM ---> 8086 | 8086 |
+             +----+      +----+------+
+                  | 8086 |  SHELTA86.COM
+                  +------+
+                  TASM.EXE
+
+Lo and behold!  A Shelta compiler written in Shelta.  But that's not
+the whole story - at this point the bootstraps have been pulled taut,
+but there is one more tug that must be made to actually get levitating.
+The compiler SHELTAS.COM must prove its worth, meeting its maker
+so to speak:
+
+                                    +-----------------+
+                        SHELTAS.SHE | Shelta --> 8086 | SHELTAS2.COM
+                        +-----------+-----+      +----+
+            SHELTAS.SHE | Shelta --> 8086 | 8086 | 
+             +----------+-----+      +----+------+
+SHELTA86.ASM | TASM ---> 8086 | 8086 |   SHELTAS.COM
+             +----+      +----+------+
+                  | 8086 |  SHELTA86.COM
+                  +------+
+                  TASM.EXE
+
+Now, because of some subtle differences between SHELTA86.ASM and SHELTAS.SHE
+(the assembly language version does no optimization), the sizes and
+contents of all three of these Shelta compilers differ slightly.  But
+if the process were carried on one step further, the resultant compiler
+would be the same as SHELTAS2.COM.  The following might help clarify why
+this is:
+
+                        SHELTAS.SHE +-----------------+
+                         Optimizing | Shelta --> 8086 | SHELTAS2.COM
+            SHELTAS.SHE +-----------+-----+      +----+  Optimizing
+             Optimizing | Shelta --> 8086 | 8086 |       Optimized
+             +----------+-----+      +----+------+
+SHELTA86.ASM | TASM ---> 8086 | 8086 |   SHELTAS.COM
+NonOptimizing+----+      +----+------+    Optimizing
+Hand-Optimized    | 8086 |  SHELTA86.COM  Non-Optimized
+                  +------+ Non-Optimizing
+                           Hand-Optimized
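The labels in the diagram above can be replayed as a toy model. This Python sketch is illustrative only: `Compiler` and `compile_with` are hypothetical names, and an executable is reduced to two labels (what the compiler does to its input, and how its own code was produced). It assumes the output's behaviour comes from the source while the quality of the output's code comes from the tool doing the translating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    """Toy stand-in for a compiler executable (hypothetical model)."""
    behaviour: str     # what it does to its input
    code_quality: str  # how its own machine code was produced


def compile_with(tool: Compiler, source_behaviour: str) -> Compiler:
    """Translate a compiler source with `tool`.

    The result behaves as the source says; the quality of its code
    depends on whether the translating tool optimizes.
    """
    if tool.behaviour == "optimizing":
        quality = "optimized"
    else:
        quality = "non-optimized"
    return Compiler(source_behaviour, quality)


# SHELTA86.COM: assembled by TASM from hand-tuned assembly.
shelta86 = Compiler("non-optimizing", "hand-optimized")

# SHELTAS.SHE describes an optimizing compiler.
sheltas = compile_with(shelta86, "optimizing")
sheltas2 = compile_with(sheltas, "optimizing")
sheltas3 = compile_with(sheltas2, "optimizing")  # the hypothetical next step

print(sheltas)
print(sheltas2)
print(sheltas3 == sheltas2)
```

In this model SHELTAS.COM and SHELTAS2.COM differ only in code quality, and a further round reproduces SHELTAS2.COM exactly, which is the fixed point that the `diff` step in BOOTSTRP.BAT checks for.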
+
+OK, but why did you choose to do this, anyway?
+----------------------------------------------
+
+Well, there was certainly no reason to.  I was not moving Shelta from
+one machine to another, nor was I treating SHELTA86.ASM as a quick
+hack which would be discarded once an optimizing compiler could be
+bootstrapped.
+
+On the other hand, there was no reason *not* to, so...
+
+I did it mainly to say that I could.  Not everyone can design a language,
+write a compiler for it in the form of a 1/2-kbyte COM file, then
+bootstrap it.  I'm not sure I can say it was the hardest thing I've
+ever done, but it was difficult enough.
+
+Plus, well, it's the kind of freaky self-referential thing I've always
+been interested in.  A compiler written in the language which it
+compiles, which in the end appears to have been compiled by itself.
+
+In the preceding section I may have made what I did seem like a walk
+in the park, but it wasn't.  A large portion of the time was spent on:
+
+ - fixing bugs in SHELTA86.ASM
+ - building GUPI so that Shelta could be powerful enough to bootstrap
+   meaningfully (I could have just included SHELTA86.COM entirely as
+   inline assembly, but that would kind of defeat the purpose)
+ - fixing bugs in SHELTAS.SHE
+ - testing SHELTAS2.COM - more concentration than time, actually.
+   When you have five Shelta compilers, two in source form
+   (Turbo Assembler and Shelta) and three in executable form, you're
+   bound to get a little disoriented from having to keep track of
+   their interdependencies.
+
+I can also offer the following piece of advice to anyone who is going
+to be trying something similar: if you've already squashed down your
+first compiler's source code in order to (say) claim bragging rights on
+having built a 512-byte compiler, DO NOT attempt to simply translate
+the optimized assembly code into another language.  Rewrite it instead.
+Especially for a program of this size.  Initially trying to do a
+literal translation from SHELTA86.ASM to SHELTAS.SHE was easily the
+biggest mistake I made.
+
+Where can I find further information on bootstrapping?
+------------------------------------------------------
+
+Two books are of note: the notorious "Dragon" book by Aho, Sethi and
+Ullman gives it a brief once-over; "Compilers and Compiler Generators"
+by Terry gives it a more thorough and readable treatment.
+
+Happy levitating!
+
+Chris Pressey, Oct 20 1999
+Cat's Eye Technologies, Winnipeg, Manitoba, Canada
+                       Shelta <Maentwrog Mk IV>
+                       ------------------------
+              * * * NEAR-BETA VERSION v1999.12.23 * * *
+
+Shelta <Maentwrog Mk IV> NEAR-BETA (c)1999 Chris Pressey, Cat's-Eye Technologies.
+All rights reserved.  No warranty for suitability of any kind.
+Consider yourself lucky if your head doesn't blow up.
+This product is freely redistributable provided that all distributed copies
+include the entire original unmodified archive.
+
+What is Shelta <Maentwrog Mk IV>?
+---------------------------------
+
+Shelta <Maentwrog Mk IV> (which I'll generally just call Shelta during
+the scope of its documentation) is a set of interrelated software
+systems:
+
+- Shelta the Language - somewhere between FORTH, FALSE, and Assembler
+- SHELTA86.COM the Compiler - a Shelta compiler written in 8086 asm
+- SHELTA.BAT the Tool - organizer ('make') for Shelta files & libraries
+- SHELTAS.COM and SHELTAS2.COM - Shelta compilers written in Shelta
+- The GUPI Protocol - a standardized library of Shelta definitions
+
+What's the history of Shelta?
+-----------------------------
+
+My first programming language ever was called Maentwrog, a term taken
+from that wholly remarkable book, _The Meaning of Liff_, by Douglas
+Adams.
+
+Maentwrog sucked.  But, it worked, kind of.  It was interpreted, but
+as I recall, it wasn't even tokenized... making the interpreter more
+than a little slow.  It was basically a subset of FORTH - not much
+special there.
+
+My second programming language ever was based on Maentwrog, and it
+spawned a big hit called Befunge.  Befunge left Maentwrog in the dust,
+because there WAS much special there - Befunge is 2D, and that's
+trippy.  If you haven't tried to program in Befunge yet... try it!
+
+However, I've always felt that I fell somewhat short of the mark I
+was trying to make with Befunge-93.  After all, it was inspired by
+FALSE and Brainf*ck, but unlike either of them, it was not a small
+machine-dependent compiler.  It was a big, portable interpreter.
+(In 1998 I rewrote the interpreter in assembly language to make a
+Befunge-93 interpreter that fit into 2K.  But it's just not the same.)
+The urge to write a tiny compiler has been gnawing at me for the
+past few years.
+
+As such, Maentwrog has not been totally forgotten.  Over the years I've
+made a few attempts at reworking the Maentwrog language, with little
+success, until now.  The main thing holding back Maentwrog for so
+long was its lack of strict design principles.  Only now has the
+subconscious philosophy of Maentwrog evolved to a point where it means
+anything.  The result is Shelta.
+
+What is Shelta's Etymology?
+---------------------------
+
+_The Oxford Dictionary of Current English_, 1996, describes Shelta as an
+"ancient hybrid secret language used by Irish tinkers, Gypsies, etc."
+Shelta <Maentwrog Mk IV> is targeted at a similar present-day audience.
+Would you sometimes rather consider yourself a tinker (or a Gypsy) than
+a computer programmer?  Then Shelta may just be for you.
+
+What is Shelta's Philosophy?
+----------------------------
+
+Shelta's philosophy is one of simplicity of translation.  Shelta is
+easy to implement in assembly language.  Shelta is nearly as low-level
+as assembly language.  Very small Shelta compilers can be built.
+
+Shelta is also relatively easy to bootstrap - that is, it's not that
+difficult to implement a Shelta compiler in Shelta itself.  In fact,
+that (along with writing a ridiculously small compiler) was my main
+motivation for designing Shelta and building SHELTA86.COM.  For more
+information on the bootstrapped Shelta compilers and bootstrapping in
+general, see the file bootstrp.txt.
+
+In and of itself, Shelta has no actual functional semantics: only
+structural ones.  It relies on either a library of functions (such as
+GUPI, described below) or inline machine language in order to be
+considered Turing-Complete.  That is, not unlike the ancient Shelta
+language, Shelta <Maentwrog Mk IV> is hybridized: the actual programming
+is usually done in Shelta/GUPI.
+
+What are Shelta's Influences?
+-----------------------------
+
+Shelta is influenced largely by the wholly remarkable programming
+language FALSE, by Wouter van Oortmerssen - "FORTH with lambda
+functions".  However, it is lower-level than FALSE.  It is more like
+FORTH in some ways - for example, multicharacter user-defined names
+can be used to name unlimited variables, not just the a-z in FALSE.
+Lastly, it is unlike FORTH and more like Assembler in that there is
+no FORTH-like environment nor any fixed-size blocks of text as
+input files.
+
+What is Shelta's Syntax?
+------------------------
+
+Tokens are delimited by whitespace - any whitespace and as much of it
+as you like, but as long as two non-whitespace characters are adjacent,
+they are considered part of the same token.  Shelta's idea of
+whitespace is, in ASCII, everything from #32 (space) down to #1 (^A).
+(#0 (NUL) is considered synonymous with an end-of-file condition.)
+
+The exception to the above rule is a comment block, which begins
+(anywhere) with a ";" character and ends at the next ";" character.
+This can occur even in the middle of a token, so "HE; foo ;LLO" is
+taken to be the token "HELLO".
+
+User-defined tokens - depicted with "Name" in the following table -
+can contain any non-whitespace characters, and can begin with any
+non-whitespace characters except for "[", "]", "\", '^', "_" and "`".
+(This includes digits - "1" by itself is a name, not a number.)
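The tokenizing rules above can be sketched in a few lines of Python. This is a minimal illustration and not the code of any Shelta compiler: the function name `tokenize` is hypothetical, and the sketch assumes a NUL simply ends the input wherever it appears.

```python
from typing import List


def tokenize(source: str) -> List[str]:
    """Split Shelta source text into tokens, following the rules above."""
    tokens = []
    current = []
    in_comment = False
    for ch in source:
        if ch == "\x00":
            # NUL is treated as end-of-file
            break
        if in_comment:
            if ch == ";":
                # the next ';' closes the comment
                in_comment = False
            continue
        if ch == ";":
            # a comment may begin anywhere, even in the middle of a token
            in_comment = True
        elif 1 <= ord(ch) <= 32:
            # everything from #1 (^A) up to #32 (space) is whitespace
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


print(tokenize("HE; foo ;LLO"))
print(tokenize("[ `ABC ]\t\\123\n"))
```

Because a comment does not end the token being collected, `HE; foo ;LLO` comes out as the single token `HELLO`, as described above.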
+
+What are the recognized tokens of Shelta?
+-----------------------------------------
+
+	[	Begin block.
+*	]	End block, push pointer.
+	]=Name	End block, name pointer.
+	]:Name	End block, name pointer to compile-time-only block.
+*	]Name	End block, push named pointer.
+	^Name	Push pointer to previously named block.
+	_^Name	Insert pointer to previously named block.
+	Name	Insert contents of previously named block.
+
+*	`ABC	Insert string.
+	_123	Insert decimal byte.
+	__1234	Insert decimal word.
+	\123	Push decimal word.
+
+* = not available in SHELTA86.
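The table above amounts to a dispatch on the leading characters of each token. The following Python sketch shows one way to express it; `classify` is a hypothetical helper, and it assumes only the first one or two characters decide the form, with the longer prefixes (`_^`, `__`, `]=`, `]:`) checked before the shorter ones.

```python
def classify(token: str) -> str:
    """Name the token form from the table above (illustrative only)."""
    if token.startswith("["):
        return "begin block"
    if token == "]":
        return "end block, push pointer"
    if token.startswith("]="):
        return "end block, name pointer"
    if token.startswith("]:"):
        return "end block, name pointer to compile-time-only block"
    if token.startswith("]"):
        return "end block, push named pointer"
    if token.startswith("_^"):
        return "insert pointer to previously named block"
    if token.startswith("__"):
        return "insert decimal word"
    if token.startswith("_"):
        return "insert decimal byte"
    if token.startswith("^"):
        return "push pointer to previously named block"
    if token.startswith("`"):
        return "insert string"
    if token.startswith("\\"):
        return "push decimal word"
    # anything else, including a bare "1", is a user-defined name
    return "insert contents of previously named block"


for tok in ["[", "]=my-data", "_^my-data", "__1234", "_88", "\\123", "1"]:
    print(tok, "->", classify(tok))
```

Checking `_^` and `__` before the single `_` matters: otherwise `__1234` would be read as a decimal byte with a stray underscore.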
+
+What are some common syntactic idioms in Shelta?
+------------------------------------------------
+
+	[ `ABC ]		Push pointer to string.
+	[ _5 _5 _5 _3 ]		Push pointer to byte array.
+	[ __1234 __9999 ]	Push pointer to word array.
+	[ _5 _5 ]=my-data	Name a byte array.
+        [ _^my-data ]=my-refs   Name an array of references to data.
+	_88			Insert anonymous inline machine code.
+	[ _88 ]:xyz 		Define xyz as inline machine code.
+	xyz			Insert named inline machine code.
+        [ bar baz ]:foo		Declare 'foo' as an inline proc
+        foo			Insert 'foo' as an inline proc.
+        [ ]=bar                 Define named label.
+
+Where do you use : instead of = after ]?
+----------------------------------------
+
+Originally, Shelta did not distinguish between blocks used as
+updatable stores, subroutines, or templates for inlined instructions.
+As such, it would include all of them into the resulting executable,
+even the blocks only used at compile-time to define inline instructions.
+
+By using : instead of = after ], the Shelta compiler will treat the
+block as containing information which is only used at compile-time.
+This is essentially a contract between the programmer and the compiler;
+the programmer promises not to expect the ^Name or _^Name syntax to work
+on the block, and the compiler ensures the block does not show up
+extraneously in the resulting executable.
+
+What are some of the quirks of Shelta?
+--------------------------------------
+
+Shelta's lambda syntax is not uniform.  On the top level, an empty
+block such as this:
+  [ ]=label
+is not necessarily defined to actually work.  It is only defined to
+produce sensible results when nested within another block like this:
+  [ [ ]=label foo ]=block
+Also, this is NOT the same thing as saying:
+  [ [ foo ]=block1 bar ]=block2
+This linearizes block1 out of block2, almost as if you had said
+  [ foo ]=block1 [ bar ]=block2
+Except that the identifier block1 is 'supposed' to be local to block2
+(it ISN'T, but it might be good programming practice to treat it that
+way anyway! :-)
+
+To make things even worse, nesting more than two levels deep like so
+  [ foo [ bar [ baz ] quuz ] phlef ]
+probably doesn't do what you expect.  Feel free to experiment, though.
+
+Shelta does not have or use forward references.  That seems to be no
+problem, with the lambda-like declarations, but it can often force you
+to write weird and awkwardly structured code.  If you need to refer to
+the current block from within it, you can always name a block twice:
+
+  [ foo ^this bar ]=this              ; won't work! ;
+  [ [ ]=-this foo ^-this bar ]=this   ; works! ^this == ^-this ;
+
+What is SHELTA86.COM?
+---------------------
+
+The Shelta compiler is implemented in 8086 assembly language and assembles
+to a tiny (LESS THAN HALF A KILOBYTE! :-) executable program.  There are
+several restrictions on the program in order to trim fat:
+
+- Input file goes in standard input, .COM file comes out standard output.
+  File should end in a ^@ (NUL) character to indicate EOF.
+  This NUL should be preceded by whitespace (otherwise it'll form
+  part of a token - you don't want that! :-)
+- If a file error occurs, error code 32 is returned.
+- If an undefined token is found, error code 16 is returned.
+- The forms ] (End block push pointer,) ]Name (End block push named
+  pointer,) and `xyz (Insert String) are not supported.  It is not
+  difficult to work around these by explicitly naming and pushing
+  blocks and using ASCII decimal sequences for strings.  These
+  inequities could even be addressed by a simple pre-processor.
+
+What is SHELTA.BAT?
+-------------------
+
+SHELTA.BAT allows one to harness a Shelta compiler such as SHELTA86.COM,
+without having to directly put up with its silly interface.
+
+Usage:
+	SHELTA compiler project-file {library-files...}
+
+'compiler' is one of: 86 (the assembly-language compiler,) S (the
+compiler written in Shelta and compiled with 86), or S2 (the
+compiler written in Shelta and compiled with S.)  (See the file
+bootstrp.txt for more information on the Shelta compilers written
+in Shelta.)
+
+You don't need to append '.she' to project-file or any library-file
+you choose to include; SHELTA will do that for you and will
+automatically name the output 'project-file.COM'.
+
+SHELTA should support up to nine arguments, so you can specify seven
+library files on the command line (there's no 'include' directive in
+Shelta itself.)
+
+As an example of how to use SHELTA, here's how to build and test one
+of the example Shelta/GUPI programs, "Hello, world!":
+
+(Updated Dec 8 2002 to reflect new directory structure:)
+
+  cd shelta-<<version>>
+  bin\shelta s2 prj\hello
+  prj\hello
+
+How can I specify what libraries for SHELTA.BAT to use by default?
+------------------------------------------------------------------
+
+By default SHELTA.BAT includes the following libraries:
+
+  8086\8086.she       8086 subset definition
+  8086\gupi.she       General GUPI library (defined in 8086 subset)
+  8086\string.she     GUPI string functions (defined in 8086 subset)
+  8086\dos.she        DOS-dependent GUPI I/O (defined in 8086 subset)
+  gupi\linklist.she   Linked list library (defined in GUPI)
+
+You can edit SHELTA.BAT to change which libraries it uses by default.
+(It is just a .BAT file after all.)
+
+8086.she: One could presumably replace these inline instructions with
+equivalent instructions for another relative-addressing processor,
+change a few lines of SHELTA86.ASM, and voila!  You could compile
+Shelta to some other CPU.  It'd be a cute trick...
+
+gupi.she: While Shelta comes with GUPI, you don't need to use GUPI
+with Shelta!  You can comment out gupi.she and completely redefine the
+semantics of your Shelta.  For example, you could use a very small
+set of instructions (a tar pit) and use Shelta to compile languages
+very similar to Brainf*ck, Malbolge, etc.
+
+dos.she: You can replace the dos.she library with the bios.she library;
+it does the same thing but goes directly through the BIOS instead, and
+you can write code that will work without DOS loaded (so you could even
+build your own OS or embedded controller code with Shelta! ;-)
+
+What is GUPI?
+-------------
+
+GUPI stands for Generic Utilitarian Programming Interface.  GUPI
+is a set of Shelta definitions that acts as a standard library.
+
+(Fact is, I'm not a big fan of how GUPI turned out; it is a
+contrived and contingent beast, rather than the beautifully
+designed and conceptually airtight scheme I had hoped for.
+But I figure that if it was good enough to get me this far, it's
+worth keeping around, and the hybrid design of Shelta makes it
+easy to swap it for something else at a later time, anyway.)
+
+The GUPI semantics as presented here work on a stack of word values
+in a FORTH-like manner.  Note that GUPI is not yet well documented.
+Nor is it guaranteed not to change (although it looks unlikely;
+any major change will warrant its own library; "GUPII" perhaps? :-)
+
+What are some of the naming conventions of GUPI?
+------------------------------------------------
+
+Generally speaking...
+The suffix b indicates 'byte'.
+The suffix c indicates 'character'.
+The suffix if indicates 'decision on a boolean'.
+The suffix s indicates 'string with length' (block).
+The suffix w indicates 'word' (normally 16-bit).
+The suffix z indicates 'null-terminated string' (ASCIIZ).
+
+Lack of any suffix usually indicates 'any type'.
+
+What are the basic GUPI commands?
+---------------------------------
+
+	pop		pop and discard top stack element
+	dup		duplicate top stack element
+	swap		pop a, pop b, push a, push b
+
+	to		pop pointer, machine unary jump to pointer
+	do		pop pointer, machine sub call pointer
+        toif		pop pointer, pop boolean, unary jump if nonzero
+	doif		pop pointer, pop boolean, sub call if nonzero
+	begin		pop return pointer and push onto call stack
+	end		push return pointer from call stack
+
+ begin, end, and do/doif lead to the following GUPI idiom:
+   [ begin baz end ]=bar	Declare 'bar' as a subroutine.
+   ^bar do			Call 'bar' as a subroutine call.
+
+What are the memory-access commands?
+------------------------------------
+
+	getb		pop pointer, push byte data at pointer
+	putb		pop pointer, pop byte value, write at pointer
+	getw		pop pointer, push word data at pointer
+	putw		pop pointer, pop word value, write at pointer
+
+What are the arithmetic and logic commands?
+-------------------------------------------
+
+	++		pop a, push a + 1
+	--		pop a, push a - 1
+	**		pop a, push a << 1
+	//		pop a, push a >> 1
+	<<		pop a, pop b, push b << a
+	>>		pop a, pop b, push b >> a
+	+		pop a, pop b, push b + a
+	-		pop a, pop b, push b - a
+	*		pop a, pop b, push b * a
+	/		pop a, pop b, push b / a
+	%		pop a, pop b, push b mod a
+*1	/%		pop a, pop b, push b / a, push b mod a
+	!		pop a, push binary not a
+	zero		pop a, push 1 if a = 0, push 0 otherwise
+	&		pop a, pop b, push a binary and b
+	|		pop a, pop b, push a binary or b
+	~		pop a, pop b, push a binary xor b
+
+*1.  The algorithm commonly used for binary division actually computes
+  the results of both division and modulo (remainder for a > 0).  If
+  both results are desired by the program, using /% is usually nearly
+  twice as efficient as using / and % separately.
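The operand order in the table ("pop a, pop b, push b - a") is easy to get backwards, so here is a small Python model of a few of these words. It is a sketch and not GUPI itself: `run` and `MASK` are hypothetical names, words are assumed to be unsigned 16-bit values, and only `\123` literals plus `++ -- + - * / % /%` are handled.

```python
MASK = 0xFFFF  # assumption: words are unsigned 16-bit values


def run(program: str) -> list:
    """Evaluate a handful of GUPI arithmetic words; returns the final stack."""
    stack = []
    for tok in program.split():
        if tok.startswith("\\"):
            # \123 pushes a decimal word
            stack.append(int(tok[1:]) & MASK)
        elif tok == "++":
            stack.append((stack.pop() + 1) & MASK)
        elif tok == "--":
            stack.append((stack.pop() - 1) & MASK)
        elif tok in ("+", "-", "*", "/", "%", "/%"):
            a = stack.pop()  # "pop a": the top of the stack
            b = stack.pop()  # "pop b": the element beneath it
            if tok == "+":
                stack.append((b + a) & MASK)
            elif tok == "-":
                stack.append((b - a) & MASK)
            elif tok == "*":
                stack.append((b * a) & MASK)
            elif tok == "/":
                stack.append(b // a)
            elif tok == "%":
                stack.append(b % a)
            else:
                # "/%" pushes the quotient first, then the remainder
                stack.append(b // a)
                stack.append(b % a)
        else:
            raise ValueError("word not modelled in this sketch: " + tok)
    return stack


print(run(r"\8 \8 * ++"))
print(run(r"\17 \5 /%"))
```

The first program leaves 65 on the stack, which is the value demo.she passes to `outc` to print an "A"; the second leaves the quotient 3 beneath the remainder 2.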
+
+Indirect arithmetic?
+--------------------
+
+	@++		pop pointer, increment word at pointer
+	@--		pop pointer, decrement word at pointer
+
+How does GUPI interface with the operating system?
+--------------------------------------------------
+
+	outs		pop length, pop pointer, send bytes to stdout
+	outc		pop word, send low byte to stdout
+	inc		wait for input on stdin, push character read
+	qinc		quietly wait for input on stdin, push character
+	chkin		immediately return input status (is a char waiting?)
+	flin		flush all unread input
+	halt		pop a, stop program and return to operating system
+			with error code 'a'
+
+And dynamic memory?
+-------------------
+
+	malloc		pop size, push ptr to memory of length size
+	mfree		pop ptr, reset heap ptr to ptr
+		(Note that mfree will free ALL pointers that were
+		allocated with malloc since the pointer that is being
+		freed was allocated.  It's good for local
+		linked lists and such, but be careful!)
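The behaviour of mfree described above is what a simple "bump" allocator gives you. The Python sketch below models that idea under an assumption: that malloc only advances a heap pointer. The class name `BumpHeap` and its fields are hypothetical, and the real GUPI implementation is not shown in this document.

```python
class BumpHeap:
    """Toy model of a heap that only tracks a single 'next free' pointer."""

    def __init__(self, base: int = 0):
        self.top = base  # the heap pointer: address of the next free byte

    def malloc(self, size: int) -> int:
        """Return the current heap pointer, then advance it by `size`."""
        ptr = self.top
        self.top += size
        return ptr

    def mfree(self, ptr: int) -> None:
        """Reset the heap pointer to `ptr`.

        Every block allocated after `ptr` was handed out is released too.
        """
        self.top = ptr


heap = BumpHeap()
a = heap.malloc(16)
b = heap.malloc(32)
c = heap.malloc(8)
heap.mfree(b)          # releases b AND c; a is still live
d = heap.malloc(4)     # reuses the space that b occupied
print(a, b, c, d, heap.top)
```

Freeing `b` here also invalidates `c`, which is why the note above says to be careful: the scheme suits temporary structures that are built and discarded together.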
+
+What is "Portable Shelta/GUPI"?
+-------------------------------
+
+The short answer is, "Portable Shelta/GUPI" is the subset of the
+union of the Shelta and GUPI languages where, through patience
+and restraint - i.e. discipline - the Shelta/GUPI programmer
+does not use any machine-dependent or self-modifying code, and
+restricts themselves to the GUPI functions that do likewise or
+are specified precisely and abstractly enough to be ported,
+that is, re-written in some other machine or VM bytecode.
+
+Where can I get updates on Shelta's condition?
+----------------------------------------------
+
+Shelta's official web site is located at:
+
+  http://www.catseye.mb.ca/esoteric/shelta/
+
+Happy tinkering!
+
+Chris Pressey, Dec 23 1999
+Cat's-Eye Technologies, Winnipeg, Manitoba, Canada
+;
+  99.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
+  The song "Ninety-Nine Bottles of Beer" implemented in Shelta/GUPI.
+;
+
+[ _32 `bottles _32 `of _32 `beer _32 `on _32 `the _32 `wall, _13 _10 ]=L1
+[ _32 `bottles _32 `of _32 `beer, _13 _10 ]=L2
+[ `Take _32 `one _32 `down, _32 `pass _32 `it _32 `around, _13 _10 ]=L3
+[ _32 `bottles _32 `of _32 `beer _32 `on _32 `the _32 `wall. _13 _10 _13 _10 ]=L4
+
+[ `9 ]=bh [ `9 ]=bl
+[ begin ^bh getb outc ^bl getb outc end ]=btls
+[ begin ^bh getb \1 - ^bh putb \57 ^bl putb end ]=digit
+
+[ [ ]=iloop
+
+  ^btls do
+  ^L1 \31 outs
+
+  ^btls do
+  ^L2 \19 outs
+
+  ^L3 \32 outs
+
+  ^bl getb \1 - ^bl putb
+  ^bl getb \47 - zero ^digit doif
+
+  ^btls do
+  ^L4 \33 outs
+
+  ^bh getb \47 - ^iloop toif
+
+  \0 halt
+] to
+;
+  demo.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
+  A demonstration of some of the basic features of Shelta and GUPI.
+;
+
+[
+  [ ]=hw `Hello, _32 `world!     ; an empty block denotes a label ;
+  [ ]=eol _13 _10
+]=hello
+
+[ _0 ]=i
+[ _0 ]=pad
+[ __0 ]=hptr
+
+[
+  begin
+  \1024 malloc ^hptr putw
+  [ ]=wloop
+    ^i getb ^hptr getw ^i getb + putb
+    ^i getb ++ ^i putb
+    ^i getb ^wloop toif
+  ^hptr getw \32 + \223 outs
+  end
+] do
+
+^hello \15 outs
+
+^hw \12 outs
+^eol \2 outs
+
+^hello getb outc
+
+^hello \1 + getb outc
+
+\65 ^hello putb ^hello \15 outs
+
+\1000 \8 / outc
+\8 \8 * ++ outc
+\8 \9 * ++ outc
+\9 \9 * -- outc
+
+flin
+[
+  [ ]=loop
+  inc outc ^loop to ;forever!;
+] to
+;
+  hello.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
+  The ubiquitous greeting message, implemented in Shelta/GUPI.
+;
+[ `Hello, _32 `world! _13 _10 ] \15 outs \0 halt
+;
+  sheltas.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
+  A bootstrappable Shelta compiler written in Shelta/GUPI.
+;
+
+[ __0 ]=safestart
+[ __0 ]=namestart
+
+[ __0 ]=codeba
+[ __0 ]=stacba
+[ __0 ]=safeba
+[ __0 ]=macrba
+[ __0 ]=tokenba
+
+[ __0 ]=symthead
+[ __0 ]=codeh
+[ __0 ]=stach
+[ __0 ]=safeh
+[ __0 ]=macrh
+[ __0 ]=tokenh
+
+[ begin \16 halt end ]=badtok
+[ begin dupz end ]=fndupz
+
+;--------------------------------------;
+
+[ __0 ]=newn
+[ ; addr dlen strz -> void ;
+  begin
+  ^fndupz do
+ 
+  ; link up the new node ;
+  ^symthead getw \6 ll-node dup ^newn putw ll-link
+  ^newn getw ^symthead putw
+
+  ; addr dlen strz ;
+  ^newn getw ll-dptr putw
+  ^newn getw ll-dptr \2 + putw
+  ^newn getw ll-dptr \4 + putw
+  
+  end
+]=InsertSymbol
+
+[ __0 ]=lui
+[ __0 ]=luitok
+[ \0 end ]=luno
+[ ^lui getw ll-dptr \4 + getw
+  ^lui getw ll-dptr \2 + getw end ]=luyes
+[ ; strz -> dlen addr, that is, addr is pushed first;
+  begin
+  ^luitok putw
+  ^symthead getw ^lui putw
+
+  [ ]=luloop
+    ^lui getw zero ^luno toif
+    ^lui getw ll-dptr getw ^luitok getw eqzz ^luyes toif
+    ^lui getw ll-next ^lui putw
+    ^luloop to
+
+]=LookupSymbol
+
+;--------------------------------------;
+
+[ __0 ]=ddtoken   ; contains pointer into token where to decipher ;
+[ __0 ]=ddvalue   ; contains running tally of the value ;
+[
+  begin
+  ^tokenba getw + ^ddtoken putw
+  \0 ^ddvalue putw
+  [ ]=ddLoop
+    ^ddvalue getw \10 *
+    ^ddtoken getw getb \48 - +
+    ^ddvalue putw
+    
+    ^ddtoken @++
+    ^ddtoken getw getb \47 > ^ddLoop toif
+
+  ^ddvalue getw
+  end
+]=DecipherDecimal
+
+;--------------------------------------;
+
+[
+  begin
+  ^codeh getw ++ putw
+  \184 ^codeh getw putb
+  \80 ^codeh getw \3 + putb
+  ^codeh getw \4 + ^codeh putw
+  end
+]=WritePush
+
+[
+  begin
+  \1 ^DecipherDecimal do ^WritePush do
+  end
+]=PushWord
+
+;--------------------------------------;
+
+[
+  ^tokenba getw \2 + ^LookupSymbol do pop
+  dup zero ^badtok toif
+  ^safeba getw - \260 +
+  ^codeh getw putw
+  ^codeh getw \2 + ^codeh putw
+  end
+]=LiteralSymbol
+[
+  \2 ^DecipherDecimal do ^codeh getw putw
+  ^codeh getw \2 + ^codeh putw
+  end
+]=LiteralWord
+[
+  begin
+
+  ^tokenba getw ++ getb \95 - zero ^LiteralWord toif
+  ^tokenba getw ++ getb \94 - zero ^LiteralSymbol toif
+  \1 ^DecipherDecimal do 
+  ^codeh getw putb
+  ^codeh @++
+  end
+]=LiteralByte
+
+;--------------------------------------;
+
+[
+  begin
+  ^tokenba getw ++ ^LookupSymbol do pop
+  dup zero ^badtok toif
+  ^safeba getw - \260 + ^WritePush do
+  end
+]=PushPointer
+
+;--------------------------------------;
+
+[ __0 ]=strct
+[
+  begin
+  \1 ^strct putw
+  [ ]=strLoop
+
+    ^tokenba getw ^strct getw + getb
+
+    ^codeh getw putb
+
+    ^codeh @++
+    ^strct @++
+
+    ^tokenba getw ^strct getw + getb ^strLoop toif
+
+  end
+]=String
+
+;--------------------------------------;
+
+[
+  begin
+  ^codeh getw
+  ^stach getw putw
+  ^stach getw \2 + ^stach putw
+  end
+]=BeginBlock
+
+[ __0 ]=ebtokptr
+[ __0 ]=ebtoklen
+[ __0 ]=ebdatlen
+[ __0 ]=origcodeh
+[
+  begin
+  ; adjust namestart ... possibly the weirdest Shelta-ism ;
+  ^namestart getw ^origcodeh getw + ^stach getw \2 - getw - ^namestart putw
+  end
+]=AdjustName
+[ __0 ]=nei ; a shared counter for the next two subroutines ;
+[
+  ^macrh getw ^namestart putw
+
+  ; copy everything from origcodeh to codeh into the macro area ;
+
+  ^origcodeh getw ^nei putw
+
+  [ ]=mloop
+
+    ^nei getw getb ^macrh getw putb
+    ^nei @++
+    ^macrh @++
+    ^nei getw ^codeh getw - ^mloop toif
+
+  ; change codeh back to origcodeh ;
+
+  ^origcodeh getw ^codeh putw
+
+  end
+]=MacroInstead
+[
+  begin
+  ^tokenba getw ++ getb \58 - zero ^MacroInstead toif
+
+  ; copy everything from origcodeh to codeh into a safe area ;
+
+  ^origcodeh getw ^nei putw
+
+  [ ]=neloop
+
+    ^nei getw getb ^safeh getw putb
+    ^nei @++
+    ^safeh @++
+    ^nei getw ^codeh getw - ^neloop toif
+
+  ; change codeh back to origcodeh ;
+
+  ^origcodeh getw ^codeh putw
+  end
+]=NotEmpty
+[ begin   ^tokenba getw \2 + ^ebtokptr putw end ]=incebtokptr
+[
+  ; insert name into dictionary ;
+  ^namestart getw ^ebdatlen getw ^ebtokptr getw   ^InsertSymbol do
+  end
+]=NameIt
+[
+  begin
+
+  ^tokenba getw ++ ^ebtokptr putw
+  ^tokenba getw ++ getb \58 - zero ^incebtokptr doif
+  ^tokenba getw ++ getb \61 - zero ^incebtokptr doif
+
+  ^safeh getw dup ^safestart putw ^namestart putw ; track starts ;
+
+  ^ebtokptr getw lenz ^ebtoklen putw
+
+  ^stach getw \2 - ^stach putw
+  ^stach getw getw ^origcodeh putw     ; get original code head ;
+
+  ^codeh getw ^origcodeh getw - ^ebdatlen putw
+
+  ^stach getw ^stacba getw - ^AdjustName doif
+
+  ^ebdatlen getw \0 > ^NotEmpty doif
+
+  ; write push instruction if '=' or ':' not used ;
+
+  ^tokenba getw ++ getb \58 - zero ^NameIt toif
+  ^tokenba getw ++ getb \61 - zero ^NameIt toif
+
+  \184 ^codeh getw putb
+  \80  ^codeh getw \3 + putb
+  ^safestart getw ^safeba getw \260 - - ^codeh getw ++ putw
+
+  ^codeh getw \4 + ^codeh putw
+
+  ^tokenba getw ++ getb ^NameIt toif
+  end
+]=EndBlock
+
+;--------------------------------------;
+
+[ __0 ]=urctr
+[ __0 ]=urlen
+[ __0 ]=uraddr
+[
+  ^codeh getw -- -- ^codeh putw
+  ; ^codeh \4 \2 fwrite ^crlf \2 \2 fwrite ;
+  end
+]=wipeit
+[
+  ^codeh getw \2 - getb \80 - zero ^wipeit toif end
+]=peep
+[
+  ^codeh getw -- getb \88 - zero ^peep toif end
+]=peepok
+[
+  begin
+  ^codeh getw ^codeba getw - ^peepok toif end
+]=clean
+[
+  [ ]=urloop
+    ^uraddr getw getb ^codeh getw putb
+    ^uraddr @++
+    ^codeh @++
+    ^urlen @--
+    ^urctr @++
+
+    ^urctr getw -- zero ^clean doif
+
+    ^urlen getw ^urloop toif
+  end
+]=curloop
+[
+  begin
+  \0 ^urctr putw ^tokenba getw ^LookupSymbol do ^urlen putw
+  dup zero ^badtok toif
+  ^uraddr putw
+
+  ; copy urlen bytes from uraddr to codeh ;
+
+  ; 1999.10.14 peephole optimization commented out.
+    it crashes and it's not strictly necessary.  someday, perhaps... ;
+
+  ^urlen getw ^curloop toif
+
+  end
+]=Unroll
+
+;--------------------------------------;
+
+[ end ]=goodc  ; char was dupped and is on stack ;
+[
+  begin
+  [ ]=floop
+    qinc dup \59 - ^goodc toif pop ; return good char if not semicolon ;
+    [ ]=cloop
+      qinc \59 - zero ^floop toif
+      ^cloop to
+  end
+]=scanc
+
+[ _0 ]=inbyte
+[ _0 ]=eoff
+[ \1 ^eoff putb end ]=goteof
+[
+  begin
+  ^tokenba getw ^tokenh putw
+  [ ]=scanloop
+    ^scanc do ^inbyte putb
+    ^inbyte getb zero ^goteof toif
+    \33 ^inbyte getb > ^scanloop toif
+
+    [ ]=scisloop
+    ^inbyte getb ^tokenh getw putb ^tokenh @++   ;write char to token;
+
+    ^scanc do ^inbyte putb 
+    ^inbyte getb zero ^goteof toif
+    ^inbyte getb \32 > ^scisloop toif
+
+  \0 ^tokenh getw putb
+  end
+]=scantok
+
+; --- startup --- get dynamic memory off of heap --- ;
+
+\16384 malloc dup ^safeba putw \2 + ^safeh putw
+\4096 malloc dup ^macrba putw ^macrh putw
+\4096 malloc dup ^codeba putw ^codeh putw
+\256 malloc dup ^stacba putw ^stach putw
+\128 malloc dup ^tokenba putw ^tokenh putw
+
+[ 
+  ; write output file ;
+
+  ; put in a jump over the safe area ;
+
+  ^safeh getw ^safeba getw - ++
+
+  \233 outc
+  dup \255 & outc \8 >> \255 & outc
+  \144 outc
+
+  ; make the first word of the safe area an offset ;
+  ; to just past the last word of the code ;
+
+  ^safeh getw ^safeba getw - ^codeh getw + ^codeba getw \260 - - ^safeba getw putw
+
+  ^safeba getw ^safeh getw ^safeba getw - outs
+  ^codeba getw ^codeh getw ^codeba getw - outs
+  \0 halt
+]=tail
+[ [ ]=main
+  ^scantok do
+  ^eoff getb ^tail toif
+  ^tokenba getw getb \91 - \5 > ^Unroll doif
+  ^tokenba getw getb \91 - zero ^BeginBlock doif
+  ^tokenba getw getb \92 - zero ^PushWord doif
+  ^tokenba getw getb \93 - zero ^EndBlock doif
+  ^tokenba getw getb \94 - zero ^PushPointer doif
+  ^tokenba getw getb \95 - zero ^LiteralByte doif
+  ^tokenba getw getb \96 - zero ^String doif
+  ^main to
+]=Shelta ^Shelta to
+
+; end of sheltas.she ;
+;
+  str.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
+  Demonstrates searching a list of strings.
+;
+
+[ __0 ]=head
+[ __0 ]=newn
+
+[ `Moe _0 ]=s1
+[ `Curly _0 ]=s2
+[ `Larry _0 ]=s3
+
+[ `Larry _0 ]=target     ; change this variable to test ;
+
+[ ; strz -> strz ;
+  begin dup lenz ++ malloc cpzz end
+]=strdup
+
+[ ; stooge -> void ;
+  begin
+  ^strdup do
+  ^head getw \2 ll-node dup ^newn putw ll-link
+  ^newn getw dup ^head putw ll-dptr putw
+  end
+]=add-stooge
+
+[ [ `No ] \2 outs \1 halt ]=no
+[ [ `Yes ] \3 outs \0 halt ]=yes
+
+[
+
+  ^s1 ^add-stooge do
+  ^s2 ^add-stooge do
+  ^s3 ^add-stooge do
+
+  ^head getw ^newn putw
+
+  [ ]=cloop
+    ^newn getw zero ^no toif
+    ^newn getw ll-dptr getw ^target eqzz ^yes toif
+    ^newn getw ll-next ^newn putw
+    ^cloop to
+
+] to
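
For readers unfamiliar with Shelta, the search in str.she above can be sketched in Python. This is only an illustrative analogue, not part of the Shelta distribution: the names `Node`, `add_stooge`, `contains` and `build_stooges` are hypothetical, and Python object references stand in for the word-sized pointers that `ll-node`, `ll-link` and `ll-dptr` manipulate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Mirrors the node layout implied by linklist.she: a "next" word
    # followed by a data slot (here the data is the copied string itself).
    next: Optional["Node"]
    data: str


def add_stooge(head: Optional[Node], name: str) -> Node:
    """Prepend a node, as add-stooge does: the new node links to the old head."""
    return Node(next=head, data=name)


def contains(head: Optional[Node], target: str) -> bool:
    """Walk the list until a match is found or the end (a null link) is reached."""
    node = head
    while node is not None:
        if node.data == target:
            return True
        node = node.next
    return False


def build_stooges() -> Optional[Node]:
    """Build the same three-entry list that str.she builds from s1, s2, s3."""
    head = None
    for name in ("Moe", "Curly", "Larry"):
        head = add_stooge(head, name)
    return head
```

Because each node is prepended, the last name added ends up at the head of the list, which matches the order in which the Shelta loop visits the nodes.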

lib/8086/8086.she

+;
+  8086\8086.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  Defines the instructions of the Intel 8086 chip and its successors.
+;
+
+[ _244 ]:hlt
+
+[ _146 ]:xchg-dx-ax
+[ _147 ]:xchg-bx-ax
+[ _145 ]:xchg-cx-ax
+
+[ _80 ]:push-ax
+[ _83 ]:push-bx
+[ _81 ]:push-cx
+[ _82 ]:push-dx
+
+[ _255 _55 ]:push[bx]
+
+[ _57 _195 ]:cmp-bx-ax
+
+[ _161 ]:mov-ax[]
+[ _163 ]:mov[]ax
+
+[ _142 _6 ]:mov-es[]
+
+[ _88 ]:pop-ax
+[ _91 ]:pop-bx
+[ _89 ]:pop-cx
+[ _90 ]:pop-dx
+
+[ _95 ]:pop-di
+[ _94 ]:pop-si
+
+[ _86 ]:push-si
+[ _87 ]:push-di
+
+[ _138 _4 ]:mov-al[si]
+[ _58 _5 ]:cmp-al[di]
+
+[ _139 _5 ]:mov-ax[di]
+
+[ _211 _224 ]:shl-ax-cl
+[ _211 _232 ]:shr-ax-cl
+
+[ _209 _224 ]:shl-ax-1
+[ _209 _232 ]:shr-ax-1
+
+[ _180 ]:mov-ah
+[ _176 ]:mov-al
+[ _177 ]:mov-cl
+[ _185 ]:mov-cx
+[ _187 ]:mov-bx
+[ _179 ]:mov-bl
+
+[ _255 _208 ]:call-ax
+[ _255 _224 ]:jmp-ax
+
+[ _50 _192 ]:xor-al-al
+[ _50 _228 ]:xor-ah-ah
+[ _48 _255 ]:xor-bh-bh
+[ _38 ]:es
+
+[ _137 _195 ]:mov-bx-ax
+[ _137 _194 ]:mov-dx-ax
+[ _137 _193 ]:mov-cx-ax
+
+[ _136 _204 ]:mov-ah-cl
+[ _136 _206 ]:mov-dh-cl
+
+[ _137 _211 ]:mov-bx-dx
+[ _136 _7 ]:mov[bx]al
+[ _137 _7 ]:mov[bx]ax
+[ _137 _15 ]:mov[bx]cx
+[ _136 _15 ]:mov[bx]cl
+[ _139 _7 ]:mov-ax[bx]
+[ _138 _15 ]:mov-cl[bx]
+
+[ _136 _5 ]:mov[di]al
+
+[ _116 ]:je
+[ _117 ]:jne
+[ _114 ]:jb
+[ _119 ]:ja
+[ _235 ]:jmp
+[ _11 _192 ]:or-ax-ax
+[ _10 _192 ]:or-al-al
+[ _9 _210 ]:or-dx-dx
+[ _247 _208 ]:not-ax
+
+[ _131 _251 ]:cmp-bx
+
+[ _70 ]:inc-si
+[ _71 ]:inc-di
+
+[ _67 ]:inc-bx
+[ _74 ]:dec-dx
+
+[ _128 _228 ]:and-ah
+[ _35 _194 ]:and-ax-dx
+[ _11 _194 ]:or-ax-dx
+[ _51 _194 ]:xor-ax-dx
+
+[ _49 _201 ]:xor-cx-cx
+[ _51 _192 ]:xor-ax-ax
+[ _49 _210 ]:xor-dx-dx
+
+[ _1 _208 ]:add-ax-dx
+[ _41 _208 ]:sub-ax-dx
+[ _41 _194 ]:sub-dx-ax
+
+[ _247 _234 ]:imul-dx
+[ _247 _249 ]:idiv-cx
+
+[ _64 ]:inc-ax
+[ _72 ]:dec-ax
+
+[ _159 ]:lahf
+[ _235 ]:jmp
+[ _144 ]:nop
+
+[ _205 ]:int

lib/8086/bios.she

+;
+  8086\bios.she v1999.12.23 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  BIOS interface for the OS-dependent part of GUPI.
+;
+
+;interrupt # for keybd ; [ _22 ]:keybd
+;interrupt # for video ; [ _16 ]:video
+
+;        void -> halt; [ pop-ax jmp _254 ]:halt
+
+;        char -> void; [ pop-ax mov-ah _14 mov-bl _15 int video ]:outc
+
+;string sizeb -> void; [ pop-dx pop-si mov-al[si]
+			 mov-ah _14 mov-bl _15 int video
+			 inc-si dec-dx or-dx-dx jne _242 ]:outs
+
+;        void -> char; [ xor-ah-ah int keybd xor-ah-ah push-ax ]:qinc
+;        void -> char; [ qinc dup outc ]:inc
+;        void -> bool; [ mov-ah _1 int keybd je _4 inc-ax jmp _3 nop xor-ax-ax push-ax ]:chkin
+;        void -> void; [ mov-ah _1 int keybd je _6 xor-ah-ah int keybd jmp _244 ]:flin
+;
+  8086\dos.she v1999.12.23 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  DOS interface for the OS-dependent part of GUPI.
+;
+
+;interrupt # for DOS ; [ _33 ]:dos
+
+;        void -> halt; [ pop-ax mov-ah _76 int dos ]:halt
+;string sizeb -> void; [ mov-ah _64 _187 _1 _0 pop-cx pop-dx int dos ]:outs
+;        char -> void; [ mov-ah _2 pop-dx int dos ]:outc
+;        void -> char; [ mov-ah _1 int dos push-ax ]:inc
+;        void -> char; [ mov-ah _7 int dos xor-ah-ah push-ax ]:qinc
+;        void -> bool; [ mov-ah _11 int dos xor-ah-ah push-ax ]:chkin
+;        void -> void; [ xor-ax-ax mov-ah _12 int dos ]:flin
+

lib/8086/file.she

+;
+  8086\fileio.she v1999.12.23
+    (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  DOS file functions.
+;
+
+;        zfnm -> fhdl ;
+  [ pop-dx mov-ah _61 xor-al-al int dos push-ax ]:fopenz
+
+; string fhdl -> fstat ;
+  [ mov-ah _63 pop-bx mov-cx __1 pop-dx int dos push-ax ]:freadc
+
+;        zfnm -> fhdl ;
+  [ pop-dx mov-ah _60 xor-cx-cx int dos ]:fcreatez
+
+;str szb fhdl -> void ;
+  [ mov-ah _64 pop-bx pop-cx pop-dx int dos ]:fwrite
+
+;        fhdl -> void ;
+  [ pop-bx mov-ah _62 int dos ]:fclose
+
+; quit? [ begin ^tokenba getw dup lenz \2 fwrite halt end ]=fnhalt ;

lib/8086/gupi.she

+;
+  8086\gupi.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  8086-compatible semantics for GUPI.
+;
+
+; input stack -> output stack ;
+; bottom..top -> top..bottom  ;
+
+;        word -> void     ; [ pop-ax ]:pop
+;        word -> word word; [ pop-ax push-ax push-ax ]:dup
+;   wrd1 wrd2 -> wrd1 wrd2; [ pop-ax pop-bx push-ax push-bx ]:swap
+
+;        addr -> byte     ; [ pop-ax xchg-bx-ax mov-ax[bx] xor-ah-ah push-ax ]:getb
+;   byte addr -> void     ; [ pop-ax xchg-bx-ax pop-cx mov[bx]cl ]:putb
+;        addr -> word     ; [ pop-ax xchg-bx-ax push[bx] ]:getw
+;   word addr -> void     ; [ pop-ax xchg-bx-ax pop-cx mov[bx]cx ]:putw
+
+;        word -> word     ; [ pop-ax inc-ax push-ax ]:++
+;        word -> word     ; [ pop-ax dec-ax push-ax ]:--
+;        word -> word     ; [ pop-ax shl-ax-1 push-ax ]:**
+;        word -> word     ; [ pop-ax shr-ax-1 push-ax ]://
+
+;   word word -> word     ; [ pop-ax xchg-cx-ax pop-ax shl-ax-cl push-ax ]:<<
+;   word word -> word     ; [ pop-ax xchg-cx-ax pop-ax shr-ax-cl push-ax ]:>>
+
+;        addr -> void     ; [ pop-bx mov-ax[bx] inc-ax mov[bx]ax ]:@++
+;        addr -> void     ; [ pop-bx mov-ax[bx] dec-ax mov[bx]ax ]:@--
+
+;   word word -> word     ; [ pop-ax pop-dx add-ax-dx push-ax ]:+
+;   word word -> word     ; [ pop-ax pop-dx sub-dx-ax xchg-dx-ax push-ax ]:-
+;   word word -> word     ; [ pop-ax pop-dx imul-dx push-ax ]:*
+;   word word -> word     ; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-ax ]:/
+;   word word -> word     ; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-dx ]:%
+;   word word -> word word; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-dx push-ax ]:/%
+;   word word -> word     ; [ pop-ax pop-dx or-ax-dx push-ax ]:|
+;   word word -> word     ; [ pop-ax pop-dx and-ax-dx push-ax ]:&
+;   word word -> word     ; [ pop-ax pop-dx xor-ax-dx push-ax ]:~
+;        word -> word     ; [ pop-ax not-ax push-ax ]:!
+;        word -> word     ; [ pop-ax or-ax-ax je _4 xor-ax-ax jmp _1 inc-ax push-ax ]:zero
+;   word word -> word     ; [ pop-ax pop-bx cmp-bx-ax ja _4 xor-ax-ax jmp _1 inc-ax push-ax ]:>
+
+;        addr -> (call)   ; [ pop-ax call-ax ]:do
+;        addr -> (branch) ; [ pop-ax jmp-ax ]:to
+;   bool addr -> (branch) ; [ pop-ax pop-dx or-dx-dx je _2 jmp-ax ]:toif
+;   bool addr -> (call)   ; [ pop-ax pop-dx or-dx-dx je _2 call-ax ]:doif
+
+; memory for call stack:  ; [ __0 __0 __0 __0 __0 __0 __0 __0 
+                              __0 __0 __0 __0 __0 __0 __0 __0 
+                              __0 __0 __0 __0 __0 __0 __0 __0 
+                              __0 __0 __0 __0 __0 __0 __0 __0 ]=clstk
+; memory for stack pointer; [ __0 ]=clsp
+
+;      (call) -> void     ; [ pop-ax _139 _30 _^clsp _137 _135 _^clstk _131 _6 _^clsp _2 ]:begin
+;        void -> (return) ; [ _131 _46 _^clsp _2 _139 _30 _^clsp _139 _135 _^clstk push-ax _195 ]:end
+
+; sizw -> ptrw ;
+  [ mov-bx __260 mov-ax[bx]
+    pop-dx
+    push-ax
+    add-ax-dx
+    mov-bx __260 mov[bx]ax ]:malloc
+
+; ptrw -> void ;
+  [ pop-ax mov-bx __260 mov[bx]ax ]:mfree
+
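
The `malloc` and `mfree` macros above keep a single heap pointer in the word at address 260 and move it up or down. A minimal Python sketch of that behaviour follows; it is an illustration only. The class name `BumpHeap` and the starting address used in the usage note are assumptions, and a Python attribute stands in for the word stored at address 260.

```python
class BumpHeap:
    """Illustrative model of the heap pointer kept at address 260 by gupi.she."""

    def __init__(self, heap_start: int) -> None:
        # In the compiled program this word initially holds the offset
        # just past the end of the code; here the caller supplies it.
        self.top = heap_start

    def malloc(self, size: int) -> int:
        """Return the current heap pointer, then advance it by `size` bytes."""
        ptr = self.top
        self.top = (self.top + size) & 0xFFFF  # 16-bit wraparound, as on the 8086
        return ptr

    def mfree(self, ptr: int) -> None:
        """Reset the heap pointer to `ptr`, releasing everything allocated after it."""
        self.top = ptr
```

For example, with a hypothetical start of `0x2000`, two successive calls `malloc(16384)` and `malloc(4096)` return `0x2000` and `0x6000`. Note that `mfree` does not track individual blocks: freeing a pointer also discards every allocation made after it.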

lib/8086/string.she

+;
+  8086\string.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  GUPI string-manipulation extensions.
+;
+
+; strz strz -> bool ;
+[
+  pop-di pop-si
+  mov-al[si] cmp-al[di]
+  je _5
+  xor-ax-ax
+  jmp _13
+  nop
+  or-al-al
+  je _4
+  inc-si inc-di
+  jmp _237
+  xor-ah-ah mov-al _1
+  push-ax
+]:eqzz
+
+; strz strz -> strz(2) ;
+[
+  pop-di pop-si push-di
+  mov-al[si] mov[di]al
+  or-al-al
+  je _4
+  inc-si inc-di
+  jmp _244
+]:cpzz
+
+; strz -> word ;
+[
+  pop-si
+  push-si
+  mov-al[si]
+  or-al-al
+  je _3
+  inc-si
+  jmp _247
+  pop-dx
+  push-si
+  pop-ax
+  sub-ax-dx
+  push-ax
+]:lenz
+
+[ ; strz -> strz ;
+  dup lenz ++ malloc cpzz
+]:dupz
+
+

lib/gupi/linklist.she

+;
+  gupi\linklist.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+  GUPI linked list extensions.
+;
+
+; size -> node ;       [ \2 + malloc ]:ll-node
+; next node -> void ;  [ putw ]:ll-link
+; node -> next ;       [ getw ]:ll-next
+; node -> data-ptr ;   [ \2 + ]:ll-dptr
+IDEAL
+
+;  shelta86.asm v1999.10.20 (c)1999 Chris Pressey, Cat's-Eye Technologies.
+;  Implements an assembler/compiler for the Shelta language, in 8086 assembly.
+
+;  * Special thanks to Ben Olmstead (BEM) for his suggestions for how to
+;    reduce SHELTA86.COM's size even further.
+
+MODEL	tiny
+P8086
+
+DATASEG
+
+symth		dw	symt
+codeh		dw	code
+stach		dw	stac
+safeh		dw	safe + 2
+macrh		dw	macr
+
+ttable		dw	BeginBlock, PushWord, EndBlock, PushPointer, LiteralByte ; , String
+;			[           \         ]         ^            _               `
+
+UDATASEG
+
+token		db	128 dup (?)
+
+safestart	dw	?
+namestart	dw	?
+toklength	dw	?
+
+safe		db	16384 dup (?)
+symt		db	16384 dup (?)	; 16K + 16K = 32K
+code		db	4096 dup (?)	; 
+macr		db	4096 dup (?)	; + 8K = 40K
+stac		db	256 dup (?)
+
+CODESEG
+ORG 0100h
+
+; EQUATES
+
+safeadj		EQU	(offset safe - 0104h)
+codeadj		EQU	(offset code - 0104h)
+
+; Main program.
+PROC		Main
+
+WhileFile:
+
+; ----- begin scanning token
+
+		call	ScanChar	; get char -> al
+		or	al, al
+		jz	@@EndFile
+		cmp	al, 32
+		jbe	WhileFile	; repeat if char is whitespace
+
+		mov	di, offset token
+		cld
+
+@@TokenLoop:	stosb			; put char in token
+		call	ScanChar	; get char
+		cmp	al, 32
+		ja	@@TokenLoop	; repeat if char is not whitespace
+
+@@Terminate:	mov	[byte di], 0  ; return null-terminated token
+
+; ----- end scanning token
+
+		mov	si, offset token + 1
+
+		mov	al, [byte token]
+		sub	al, '['
+		cmp	al, 4
+		ja	@@Unroll
+
+		xor	ah, ah
+		shl	ax, 1
+		xchg	bx, ax
+		mov	ax, [offset ttable + bx]
+		jmp	ax		; jump to handler as listed in ttable
+
+@@Unroll:	dec	si		; start at first character of token
+		call	LookupSymbol	; destroys DI & SI, but that's OK
+
+		; copy cx bytes from ax to codeh
+
+		xchg	ax, si
+		mov	di, [codeh]		; use di to track codeh
+		rep	movsb
+
+UpCodeH:	mov	[codeh], di
+		jmp	short WhileFile
+
+@@EndFile:	; put in a jump over the safe area
+
+		mov	ax, [safeh]
+		sub	ax, offset safe - 1
+		mov	bx, offset token	; re-use token
+		mov	[byte bx], 0e9h
+		mov	[word bx + 1], ax
+		mov	[byte bx + 3], 90h
+
+		mov	cx, 4
+		mov	dx, offset token
+		call	WriteIt
+
+		; make the first word of the safe area an offset
+		; to just past the last word of the code 
+
+		mov	cx, [safeh]
+		mov	dx, offset safe
+		sub	cx, dx
+		mov	ax, cx
+		add	ax, [codeh]
+		sub	ax, codeadj
+		mov	[word safe], ax
+
+		call	WriteIt
+
+		mov	cx, [codeh]
+		mov	dx, offset code
+		sub	cx, dx
+		call	WriteIt
+		
+		xor	al, al
+
+GlobalExit:	mov     ah, 4ch		; exit to DOS
+		int     21h
+ENDP		Main
+
+PROC		WriteIt
+
+		mov	ah, 40h
+		mov	bx, 1
+		int	21h
+		jnc	@@OK
+		mov	al, 32
+		jmp	short GlobalExit
+@@OK:		ret
+ENDP		WriteIt
+
+; -------------------------------- HANDLERS --------------------------- ;
+; When coming into any handler, di will equal the address of the null
+; (that is, the number of characters in the token + offset token)
+
+; ==== [ ==== BEGIN BLOCK ==== ;
+
+BeginBlock:	mov	di, [stach]			; push [ onto stack
+		mov	ax, [codeh]
+		stosw					; mov	[bx], ax
+		mov	[stach], di
+                jmp     WhileFile
+
+; ==== ] ==== END BLOCK ==== ;
+
+EndBlock:	;mov	si, offset token + 1	; si = token + 1 until...
+		;cmp	[byte ds:si], '='
+		;je	@@Smaller
+		;cmp	[byte ds:si], ':'
+		;je	@@Smaller
+		;jmp	short @@CarryOn
+						; remove : or = from length
+@@Smaller:	dec	di			; di left over from scanning token
+
+@@CarryOn:	mov	bx, di			; di now free to hold something until @@WName
+		sub	bx, si			; get length
+
+		mov	ax, [safeh]
+		mov	[safestart], ax
+		mov	[namestart], ax
+		xchg	ax, di			; di now holds safe area head location
+
+		mov	[toklength], bx		; length of token
+		sub	[stach], 2
+		mov	bx, [stach]		; pop [ from stack
+
+		mov	ax, [bx]		; ax = codeh when [ happened
+
+		mov	bp, [codeh]		; find length
+		sub	bp, ax
+		; mov	bp, bx			; bp = length of data between [ ... ]
+						; until @@WName below... ugh
+
+		cmp	[stach], offset stac
+		je	@@StackEmpty
+
+
+		mov	bx, [stach]
+		sub	bx, 2
+		mov	cx, [bx]
+
+		; namestart = [namestart] - (cx - ax)
+
+		sub	cx, ax
+		sub	[namestart], cx
+
+		; if dlength > 0,
+
+@@StackEmpty:	;or	bp, bp
+		;jz	@@Empty
+
+		cmp	[byte si], ':'		; si still = offset token + 1
+		jne	@@PreCopyLoop
+
+		mov	di, [macrh]		; use macro area instead of safe if :
+		mov	[namestart], di
+
+		; copy everything from ax to codeh into the di area
+
+@@PreCopyLoop:	mov	dx, ax
+		mov	cx, bp 	; 		[codeh]		sub	cx, ax
+		push	si
+		xchg	si, ax
+		rep	movsb
+		pop	si
+
+		; change codeh back to dx (old codeh before [)
+
+		mov	[codeh], dx
+
+		;mov	si, offset token + 1
+		cmp	[byte si], ':'		; si still = offset token + 1
+		je	@@UpdateMacr
+
+		mov	[safeh], di
+		jmp	short @@Empty
+@@UpdateMacr:	mov	[macrh], di
+		;jmp	short @@NameIt
+
+		; write push instruction if '=' or ':' not used
+
+@@Empty:	;cmp	[byte si], '='			; si still = offset token + 1
+		;je	@@NameIt
+
+		;mov	ax, [safestart]
+		;sub	ax, safeadj
+		;mov	bx, [word codeh]
+		;mov	[byte bx], 0b8h
+		;mov	[word bx + 1], ax
+		;mov	[byte bx + 3], 50h
+		;add	[codeh], 4
+
+		;cmp	[byte si], 0			; still offset token + 1!
+                ;je      @@Anonymous
+
+		; insert namestart into dictionary
+
+@@NameIt:	mov	cx, [namestart]
+		mov	ax, [toklength]
+
+		;cmp	[byte si], '='
+		;je	@@Bigger
+		;cmp	[byte si], ':'
+		;je	@@Bigger
+		;jmp	short @@WName
+
+@@Bigger:	inc	si
+
+@@WName:	; Destroys DI but that's OK.
+		; INPUT:  bx = ADDRESS of token to insert, ax = length of symbol,
+		; cx = pointer to data, dx = length of data
+		; OUTPUT: ds:bx = pointer to newly allocated symbol
+
+		mov	di, [symth]		; di no longer contains macrh/safeh
+		add	ax, 6			; 1 word for length, 1 for ptr, 1 for data length
+		add	[symth], ax
+
+		stosw	; mov	[word di], ax	; place ax length in symt
+
+		sub	ax, 6
+		xchg	cx, ax			; cx <- ax; ax <- cx
+		stosw	; mov	[word di], cx	; place cx (ptr to data)
+		xchg	ax, bp		
+		stosw	; mov	[word di], bp	; place bp (ptr length)
+
+		rep	movsb
+
+		mov	[symth], di
+
+@@Anonymous:    jmp     WhileFile
+
+; ==== ^ ==== PUSH POINTER ==== ;
+
+PushPointer:	;mov	si, offset token + 1
+		call	LookupSymbol		; destroys di & si, should be OK
+
+		sub	ax, safeadj
+		mov	di, [word codeh]
+		jmp	short WritePush
+
+; ==== ` ==== STRING ==== ;
+;
+;String:		;mov	si, offset token + 1
+;		mov	di, [codeh]
+;@@Loop:		mov	al, [byte ds:si]
+;		stosb
+;		inc	si
+;		cmp	[byte ds:si], 0
+;		jne	@@Loop
+;                jmp     UpCodeH
+
+; ==== _ ==== LITERAL BYTE ==== ;
+
+LiteralByte:	;mov	si, offset token + 1
+		cmp	[byte si], '_'
+		je	LiteralWord
+		cmp	[byte si], '^'
+		je	LiteralSymbol
+		call	DecipherDecimal		; destroys DI, that's OK
+		stosb	; mov	[byte bx], al
+CheapTrick:	mov	[codeh], di
+                jmp     WhileFile
+
+; ==== __ ==== LITERAL WORD ==== ;
+
+LiteralWord:	inc	si
+		call	DecipherDecimal		; destroys DI, that's OK
+FunkyTrick:	stosw	; mov	[word bx], ax
+		jmp	short CheapTrick
+
+; ==== _^ ==== LITERAL SYMBOL ==== ;
+
+LiteralSymbol:	inc	si
+		call	LookupSymbol		; destroys DI & SI, that's OK
+
+		sub	ax, safeadj
+
+		mov	di, [word codeh]
+		jmp	short FunkyTrick
+		;mov	[word bx], ax
+		;inc	[codeh]
+		;jmp	short CheapTrick
+
+; ==== \ ==== PUSH WORD ==== ;
+
+PushWord:	;mov	si, offset token + 1
+		call	DecipherDecimal		; destroys di, that's OK
+
+WritePush:	mov	[byte di], 0b8h	; B8h, low byte, high byte, 50h
+		inc	di
+		stosw   ;	mov	[word di + 1], ax
+		mov	al, 50h
+		stosb
+		mov	[codeh], di
+                jmp     WhileFile
+
+; -------------------------------- SUBROUTINES --------------------------- ;
+
+PROC		DecipherDecimal   ; uses and destroys DI
+		; INPUT: si = address of token
+		; OUTPUT: ax = value, di = codeh
+
+
+		xor	di, di
+
+@@Loop:		lodsb	; mov	al, [byte ds:si], inc si
+
+		mov	bx, di
+		mov	cl, 3
+		shl	bx, cl
+		mov	cx, di
+		shl	cx, 1
+		add	bx, cx
+
+		sub	al, '0'
+		cbw
+		add	bx, ax
+		mov	di, bx
+
+		cmp	[byte ds:si], '0'
+		jae	@@Loop
+
+		xchg	ax, di
+		mov	di, [word codeh]
+		ret
+ENDP		DecipherDecimal
+
+PROC            ScanChar
+; Scans a single character from the input file, placing
+; it in register al, which will be 0 upon error
+; or eof (so don't embed nulls in the Shelta source...)
+
+		mov	ah, 7		; read from stdin one byte
+		int	21h
+		cmp	al, ';'		; check for comment
+		je	@@Comment
+		ret
+@@Comment:	mov	ah, 7		; read from stdin one byte
+		int	21h
+		cmp	al, ';'		; check for comment
+		jne	@@Comment
+		jmp	short ScanChar
+
+ENDP            ScanChar
+
+PROC		LookupSymbol
+		; INPUT:  si = address of symbol to find, di = address of null termination
+		; OUTPUT: ds:ax = pointer to contents or zero if not found
+		; cx = length of contents
+
+		mov	bx, offset symt		; bx starts at symbol table
+		mov	bp, si
+		sub	di, si
+
+@@Loop:		mov	ax, [word bx]		; first word = token size
+
+		mov	dx, bx			; keep track of start of this symt entry
+
+		sub	ax, 6
+		cmp	ax, di
+		jne	@@Exit			; if it doesn't fit, you must acquit
+
+		add	bx, 6			; bx now points to token in symbol table
+
+;   exit if right token
+
+		xor	si, si			; reset si to token
+@@Inner:	mov	al, [byte ds:bx]	; get byte from bx=symt
+		cmp	[byte bp + si], al	; compare to si=token
+		jne	@@Exit
+		inc	bx
+		inc	si
+		cmp	si, di			; hit the length yet?
+		jb	@@Inner			; no, repeat
+
+		;   a match!
+
+		mov	bx, dx
+		mov	cx, [word bx + 4]	; third word = data length
+		mov	ax, [word bx + 2]	; second word = data ptr 
+		ret
+
+@@Exit:		mov	bx, dx
+		mov	ax, [word bx]
+		add	bx, ax
+		cmp	bx, [symth]
+		jb	@@Loop
+
+		mov	al, 16		; return 16 if unknown identifier
+		jmp	GlobalExit
+
+ENDP		LookupSymbol
+
+END		Main
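
The `DecipherDecimal` routine above multiplies the running total by ten using shifts (`x*8 + x*2`) and stops as soon as the next byte is below `'0'`. A rough Python rendering of that loop is given here as a sketch, not as a drop-in equivalent: the function name `decipher_decimal` is hypothetical, the end of the `bytes` object is treated like the terminating null, and the input is assumed to contain only ASCII digits before the terminator.

```python
def decipher_decimal(token: bytes) -> int:
    """Parse leading decimal digits the way DecipherDecimal does, in 16 bits."""
    value = 0
    i = 0
    while True:
        digit = token[i] - ord("0")  # sub al, '0'
        i += 1
        # value * 10 computed as (value << 3) + (value << 1), then add the digit;
        # the mask models the 16-bit registers wrapping around.
        value = ((value << 3) + (value << 1) + digit) & 0xFFFF
        # cmp [si], '0' / jae @@Loop: continue only while the next byte is >= '0'.
        if i >= len(token) or token[i] < ord("0"):
            break
    return value
```

As in the assembly, there is no upper-bound check, so a byte above `'9'` would also be consumed as a "digit"; the sketch keeps that behaviour rather than correcting it.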