Commits

Anonymous committed 3a6eccd

Initial import of Kosheri sources.

Comments (0)

Files changed (52)

+Kosheri
+=======
+
+Kosheri is a stack-based virtual machine.  Here's what it offers:
+
+* Garbage-collection, tagged tuples, dictionaries, function values,
+  coroutines, lightweight processes, textual and binary serialization,
+  and an assembler and disassembler.  See "Features", below, for
+  more information.
+* A really small code footprint.  On Ubuntu 10.04 LTS, a minimal VM
+  executor compiles to an executable less than 16K in size.
+* A clean, almost pedantic implementation.  See "Implementation", below.
+* Very few build dependencies.  See "Build requirements", below.
+* An extremely orthogonal architecture.  See "Architecture", below.
+* A reasonably efficient implementation.  See "Performance", below.
+
+Features
+--------
+
+* Ability to manipulate values of boolean, 32-bit integer,
+  process reference ("pid"), immutable string ("symbol"), and
+  tagged tuple type.
+* Support for manipulating certain tagged tuple types:
+  dictionaries, function values, and activation records.
+* Support for closures and coroutines via appropriate use of
+  activation records.  (ARs retain some state after being
+  called; if this is not cleared, they can be continued.)
+* A simple, mark-and-sweep garbage collector (for tuples and
+  symbols; everything else lives on the stack.)
+* Concurrent operation.  Each lightweight process can be a
+  VM process or a native process.  Native processes are used to
+  implement interfaces to the rest of the world.  Multitasking
+  is pre-emptive for VM processes, and co-operative for native
+  processes.  Concurrency is implemented in the VM; system threads
+  and processes are not used.  Interprocess communication is done
+  with Erlang-style messaging to processes' mailboxes.
+
+Implementation
+--------------
+
+The code is generally written in BSD style(9).  It compiles under
+`-ansi -pedantic` with almost all `gcc` warnings enabled and treated as
+errors.  The core interfaces are `const`ified.  `assert`s are
+plentiful.  I've tried to limit the amount of typecasting used in the
+code, but of course there are some places where it is necessary.
+
+Functions are often written in a straightforward, almost pedantic
+fashion; there are of course some exceptions here too, like the
+direct threading support.
+
+Build requirements
+------------------
+
+Only the following tools are required to build the VM and its tools:
+
+* an ANSI C compiler (default is `gcc`) and linker (`ranlib`, `ar`)
+* GNU Make
+
+I have vague plans to remove even the dependency on GNU Make (the
+code is so small that rebuilding the whole thing is not a huge burden,
+so it could be done with a shell script or a custom build tool.)
+
+Even `libc` is not a strict requirement for building the VM; a
+few functions from `libc` that the code uses are implemented
+independently in the code.  The `standalone` target builds the
+system with `-nostdlib`.  However, while it builds it currently
+does not link, as it requires some memory allocator to link to,
+and it doesn't have one... yet.
+
+Architecture
+------------
+
+From inside the virtual machine, it's a truism that "everything's
+a value, and every value that's not an atomic value is a tagged tuple."
+The latter is true even for VM code and activation records and
+dictionaries.
+
+Moreover, this truism largely holds in the C code as well.  In many
+places in the assembler and elsewhere, we don't just use C structs
+to hold our data, we use `struct value`s.
+
+Every value has defined external representations, in human-readable and
+compact binary forms, which can be generated and parsed.
+
+The rest of this section is kind of scattered...
+
+Some more notes on how the runtime is written in C...
+
+Every C function in the Kosheri runtime is run in some environment.
+
+We try to assume as little about the surrounding environment (e.g., OS)
+as possible, and to present the world to these functions in as Kosheri-
+like a way as possible.
+
+Maybe not every function, but lots of functions.  The ones that really
+matter.  Top-level functions especially -- (the ones that would be
+command-line utilities in a Unix system.)
+
+Every such function is passed a runtime environment.  This runtime
+environment consist of:
+
+-- arguments that were passed to the function in the form of a Kosheri
+dictionary (value_dict).  The function may also have a data structure
+associated with it (the "argdecl") that facilitates checking these
+arguments for syntactic correctness and automatically returning an error
+if they don't meet those checks.  (this part's not done yet)
+
+-- environment variables that were made available to the function.
+(This is possibly not OK.  Environment variables have a way of hanging
+around that arguments don't.  Some kind of adapter is called for that
+translates (relevant) env vars to arguments first.)
+
+-- streams available to the function, in the form of Kosheri processes.
+The function should not assume the presence of a particular stream, or
+at least, a function which does assume the presence of a particular
+stream should not be assumed to always work.  The function should also
+not assume that the streams that are made available to it behave in a
+particular way (for example, that a stream is directed to a vt100
+terminal.)  Instead it should look for a stream that abstractly presents
+the behaviour that it wants.
+
+Performance
+-----------
+
+Kosheri was designed to be reasonably efficient, for a virtual machine
+without a JIT compiler.
+
+The design decisions for performance are kind of all over the place,
+sometimes being extremely optimized, sometimes sacrificing pure
+performance for flexibility.
+
+* "Direct threading" is possible with C compilers that support it,
+  such as `gcc`.  This basically optimizes the main instruction-
+  selection `switch` into a computed `goto`.
+
+* The compiled VM is small, really small.  This means it can usually
+  fit entirely in the cache, and stay there.  This can sometimes result
+  in a significant performance benefit.
+
+On the other hand...
+
+* The orthogonality of "everything is a tuple" extends far and wide.
+  All values are 8 bytes, including VM instructions.  VM code size
+  could be reduced by packing 2 or 4 instructions into that 8 bytes,
+  but we don't do that yet.
+
+* Input and output are modelled as processes, which means there is
+  some small overhead (to pack and unpack a message) added to I/O;
+  but I/O is I/O-bound anyway, so this is probably not something to
+  worry about.
+
+
+Tour of the distribution
+------------------------
+
+    bin/
+
+Where compiled binaries go.
+
+    eg/
+
+Example assembly language files.
+
+    lib/
+
+Where compiled libraries (static and dynamic) go.
+
+    src/
+
+Source code for the virtual machine and tools.
+
+    src/build/
+
+Source code for tools used during build-time, and not thereafter.
+
+    tests/
+
+Tests.
+
+
+Tour of the source
+------------------
+
+    assemble.c
+
+Main program for the assembler.
+
+    chain.c
+    chain.h
+
+Utilities to accumulate a linked list of values, then turn them
+into a tuple when accumulation has finished.
+
+    cmdline.c
+    cmdline.h
+
+Harness which translates command-line arguments to dictionaries,
+and otherwise provides facilities so that programs can be written
+in a "Kosheri view" of the outside world.
+
+    disasm.c
+
+Main program for the disassembler.
+
+    discern.c
+    discern.h
+
+Routines to parse human-readable representation of tuples into
+the internal format.
+
+    file.c
+    file.h
+
+Native implementation of processes supporting the stream-like interface
+and which are backed with C's stdio.
+
+    freeze.c
+
+Main program for the human-readable-value-to-compact-binary
+serializer.
+
+    gen.c
+    gen.h
+
+Routines for generating VM instructions (used by assembler, but
+would also be useful for compilers targeting this VM.)
+
+    geninstr.c
+
+A build tool which generates instrtab.c and instrenum.h from
+vm.c.
+
+    instrtab.h
+
+Header file for the generated instrtab.c.
+
+    lib.c
+    lib.h
+
+Clean-room re-implementations of the few libc-supplied functions
+which the VM uses.
+
+    load.c
+    load.h
+
+Routines to parse (unserialize) the compact binary representation
+of values.
+
+    portray.c
+    portray.h
+
+Routines to format values into a human-readable representation.
+
+    process.c
+    process.h
+
+Routines for communicating and switching between co-operative
+lightweight concurrent processes.
+
+    render.c
+    render.h
+
+Routines for rendering terms to file-like processes.
+
+    report.c
+    report.h
+
+Routines for reporting errors, successes, etc. for the assembler,
+disassembler, freezer and thawer.
+
+    run.c
+
+Main program for the VM runner.  Takes a VM program in the compact
+binary representation and executes it.
+
+    save.c
+    save.h
+
+Routines to generate (serialize) the compact binary representation
+of values.
+
+    scan.c
+    scan.h
+
+Lexical scanner, used by the assembler, and by the discern routines.
+Fairly general, so could also be used to parse a language being
+compiled to the VM.
+
+    stream.c
+    stream.h
+
+Routines to communicate with processes which support the stream-like
+interface; you can read from them, write to them, check for eof, and
+close them.
+
+    thaw.c
+
+Main program for the compact-binary-to-human-readable-value
+unserializer.
+
+    value.c
+    value.h
+
+Data structure (and associated functions) which internally
+represents values, including atomic values (booleans, integers,
+process id's, direct-threaded opcodes) and structured values
+(tagged tuples and symbols.)
+
+    vm.c
+    vm.h
+
+The virtual machine implementation itself.
+
+    vmproc.c
+    vmproc.h
+
+The implementation of lightweight processes which uses the virtual machine
+to execute the process.
+TODO
+====
+
+Correctness
+-----------
+
+* RECV instruction.
+
+* Investigate why seemingly big-enough ARs are actually not big enough.
+
+* Immutable values.  This will help solve the problems of using
+  values as keys in a dictionary, and the problem of (not) sharing
+  data between processes.
+
+* (BIG) Stack type checker in assembler.
+
+* (BIG) Fix file processes to really be concurrent, esp. in read.
+  Use `select` (maybe export fd's to the scheduler.)
+
+* (BIG) Error handling.  Instead of just `assert()` (which isn't even there
+  when you define `NDEBUG`), throw an exception if the unexpected
+  happens.  Or, send a message to the process which created this
+  process.  (This will require `self` be accessible throughout the code
+  somehow.)
+
+Testing
+-------
+
+* Falderal tests for all variants of conditionals.
+
+* Falderal test for closures.
+
+* Falderal test for sending and receiving messages.
+
+* Falderal tests for dictionaries.
+
+* Falderal test for portraying cyclic values.
+
+* Falderal test for saving cycling values.
+
+Accessibility
+-------------
+
+* Document C APIs (header files).
+
+* enum success { SUCCESS = 1; FAILURE = 0; } and return this instead of int.
+
+* Portray values from stream_render: %v or similar.
+
+* Figure out way to debug messages received by file process
+  (can't write to file process, or you'll get an infinite loop.)
+
+Portability
+-----------
+
+* Measure sizeof struct value et al in buildinfo.  Spit out defines.
+  Think about how 64-bit systems will handle all this.
+
+Footprint
+---------
+
+* "small immediates" embedded in appropriate VM instructions.
+
+* Go back to having "functional values", but basically say
+  that these are ARs with zero space allocated for arguments / locals.
+  Then have a VM instruction to create an AR from a functional value.
+  (Maybe also have a VM instruction to create a full AR instead of a
+  functional value, from a label, for when you want to call it one-off.)
+
+Performance
+-----------
+
+* Allow symbols (/strings) to be interned.  There is of course a
+  tradeoff here, so "allow" rather than "require".
+  Maybe have VALUE_TAG being "small strings" (fit in a struct value,
+  so, like, 7 characters).  Then many problems go away...?
+
+* Option to create a (non-interned) symbol from a const string in
+  a way that does not copy the const string.
+
+* More efficient access of free variables.  Split each AR into two sections: bound variables
+  and free variables.  The bound variables are stored in the AR itself.  Free variables are
+  stored in some other AR; pointers to them are stored in this AR.  Then accessing a bound
+  variable is only a single indirection.  Tradeoff is that more work needs to be done when
+  creating a functional value.
+
+* In ARs, store top-of-stack pointer as a machine pointer,
+  not a tuple index.
+#!/bin/sh
+
+cd src
+echo "Building 'debug' version..."
+make clean debug >ERRORS 2>&1
+if [ $? != 0 ]; then
+	cat ERRORS
+	rm -f ERRORS
+	exit 1
+fi
+rm -f ERRORS
+./assemble --asmfile ../$1 --vmfile out.kvm
+gdb --args ./run --vmfile out.kvm
+rm -f out.kvm
+NEW_AR #10
+GOTO :past_q
+:q
+			; no need to reserve space for parameters
+GETI #0		; local #0 = 1st parameter = a
+STDOUT
+PORTRAY
+
+GETI #1		; local #1 = 2nd parameter = b
+STDOUT
+PORTRAY
+
+GETI #0
+GETI #1
+ADD_INT
+
+YIELD #1		; pass the result back to our caller
+RET			; end of this function
+
+:past_q
+
+PUSH #1
+
+PUSH #2
+PUSH #3
+PUSH #10		; this function will need 4 slots: 2 args, 2 stack
+FUN :q		; push a closure for q onto the stack
+CALL #2		; call it with two args
+
+PUSH #10		; this function will need 4 slots: 2 args, 2 stack
+FUN :q		; push a closure for q onto the stack
+CALL #2		; call it with two args
+
+STDOUT
+PORTRAY
+
+HALT
+NEW_AR #3
+PUSH #11	; statically initialize our local
+:label
+GETI #0
+STDOUT
+PORTRAY
+GETI #0
+PUSH #1
+SUB_INT
+SETI #0
+GETI #0
+PUSH #0
+JNE :label
+HALT
+NEW_AR #7
+SPAWN :label
+REST
+PUSH #main
+STDOUT
+PORTRAY
+HALT
+:label
+NEW_AR #7
+PUSH #worker
+STDOUT
+PORTRAY
+HALT
+# Makefile for Kosheri.
+# $Id: Makefile 146 2008-12-06 19:59:54Z catseye $
+
+CC?=gcc
+AR?=ar
+RANLIB?=ranlib
+
+O?=.o
+EXE?=
+
+OD?=./
+
+DEBUG_PORTRAY_O?=
+
+WARNS=	-Werror -W -Wall -Wstrict-prototypes -Wmissing-prototypes \
+	-Wpointer-arith	-Wno-uninitialized -Wreturn-type -Wcast-qual \
+	-Wwrite-strings -Wswitch -Wshadow -Wcast-align -Wchar-subscripts \
+	-Winline -Wnested-externs -Wredundant-decls
+LIBS=-L. -lruntime
+CFLAGS+=-ansi -pedantic ${WARNS} ${EXTRA_CFLAGS}
+
+RUNTIME_OBJS=	${OD}lib${O} ${OD}value${O} \
+		${OD}process${O} ${OD}file${O} ${OD}stream${O} \
+                ${DEBUG_PORTRAY_O} \
+		${OD}render${O}
+# ${OD}portray${O}
+
+# portray and save are only needed for non-raw output
+# instrtab is only needed for direct threading conversion
+RUN_OBJS=	${OD}run${O} \
+		${OD}load${O} \
+		${OD}vm${O} ${OD}vmproc${O} \
+		${OD}instrtab${O} \
+		${OD}portray${O} \
+                ${OD}save${O} \
+		${OD}cmdline${O}
+
+ASSEMBLE_OBJS=	${OD}assemble${O} \
+		${OD}instrtab${O} \
+		${OD}report${O} ${OD}scan${O} ${OD}discern${O} ${OD}chain${O} \
+		${OD}gen${O} ${OD}save${O} \
+		${OD}portray${O} \
+		${OD}cmdline${O}
+
+DISASM_OBJS=	${OD}disasm${O} \
+		${OD}instrtab${O} \
+		${OD}report${O} \
+		${OD}load${O} \
+		${OD}portray${O} \
+		${OD}chain${O} \
+		${OD}cmdline${O}
+
+FREEZE_OBJS=	${OD}freeze${O} \
+		${OD}save${O} \
+		${OD}report${O} ${OD}scan${O} \
+		${OD}discern${O} ${OD}chain${O} \
+		${OD}cmdline${O}
+
+THAW_OBJS=	${OD}thaw${O} \
+		${OD}load${O} \
+		${OD}report${O} \
+                ${OD}portray${O} \
+		${OD}cmdline${O}
+
+BUILDINFO_OBJS=	${OD}buildinfo${O} \
+		${OD}process${O} \
+		${OD}cmdline${O}
+
+PROGS=run${EXE} assemble${EXE} disasm${EXE} freeze${EXE} thaw${EXE} \
+		buildinfo${EXE}
+
+all: ${PROGS}
+
+geninstr${EXE}: geninstr.o
+	${CC} geninstr.o -o geninstr${EXE}
+
+instrtab.c instrenum.h instrlab.h localtypes.h: vm.c geninstr
+	./geninstr vm.c
+
+value.h: localtypes.h
+
+value.c: value.h
+
+vm.h: instrenum.h instrlab.h value.h
+
+libruntime.a: ${RUNTIME_OBJS}
+	${AR} rc libruntime.a ${RUNTIME_OBJS}
+	${RANLIB} libruntime.a
+
+run.c: vm.h
+
+run${EXE}: ${RUN_OBJS} libruntime.a
+	${CC} ${RUN_OBJS} ${LIBS} -o run${EXE}
+assemble${EXE}: ${ASSEMBLE_OBJS} libruntime.a
+	${CC} ${ASSEMBLE_OBJS} ${LIBS} -o assemble${EXE}
+disasm${EXE}: ${DISASM_OBJS} libruntime.a
+	${CC} ${DISASM_OBJS} ${LIBS} -o disasm${EXE}
+freeze${EXE}: ${FREEZE_OBJS} libruntime.a
+	${CC} ${FREEZE_OBJS} ${LIBS} -o freeze${EXE}
+thaw${EXE}: ${THAW_OBJS} libruntime.a
+	${CC} ${THAW_OBJS} ${LIBS} -o thaw${EXE}
+
+buildinfo${EXE}: ${BUILDINFO_OBJS} libruntime.a
+	${CC} ${BUILDINFO_OBJS} ${LIBS} -o buildinfo${EXE}
+
+
+# when DEBUG is defined, save.o, load.o, and parse.o depend on portray.o
+debug: clean
+	${MAKE} DEBUG_PORTRAY_O="${OD}portray${O}" \
+                EXTRA_CFLAGS="-g -DDEBUG"
+
+profiled: clean
+	${MAKE} EXTRA_CFLAGS="-pg"
+
+tool: clean
+	${MAKE} EXTRA_CFLAGS="-DNDEBUG -Os" LIBS="-L. -lruntime -s"
+
+#  -finline-functions ?
+
+static: clean
+	${MAKE} EXTRA_CFLAGS="-DNDEBUG -Os" LIBS="-L. -lruntime -s -static"
+
+standalone: clean
+	${MAKE} EXTRA_CFLAGS="-DNDEBUG -DSTANDALONE -DCTYPE_IS_BUILTIN -Os -static -nostdlib" LIBS="-L. -lruntime -nostdlib -s"
+
+win: clean
+	${MAKE} EXE=.exe EXTRA_CFLAGS="-DNDEBUG -mno-cygwin" all
+
+wintool: clean
+	${MAKE} EXE=.exe EXTRA_CFLAGS="-DNDEBUG -Os -static -mno-cygwin" LIBS="-L. -lruntime -s"
+
+clean:
+	rm -rf ${OD}*${O} *.so *.a *.core *.vm *.exe instrtab.c instrenum.h instrlab.h geninstr *.stackdump ${PROGS} foo.* LISTING OUTPUT
+/*
+ * assemble.c
+ * Assembler for virtual machine.
+ * $Id: assemble.c 146 2008-12-06 19:59:54Z catseye $
+ */
+
+#include "lib.h"
+#include "cmdline.h"
+#include "value.h"
+
+#include "stream.h"
+#include "file.h"
+
+#include "scan.h"
+#include "discern.h"
+
+#include "report.h"
+
+#include "gen.h"
+#include "instrtab.h"
+#include "save.h"
+
+static struct value labels;
+
+/* Routines */
+
+static struct value *
+fetch_label(const char *token, unsigned int length)
+{
+        struct value lab_text;
+
+        value_symbol_new(&lab_text, token, length);
+	return value_dict_fetch(&labels, &lab_text);
+}
+
+static void
+store_label(const char *token, unsigned int length, struct value *label)
+{
+        struct value lab_text;
+
+        value_symbol_new(&lab_text, token, length);
+	value_dict_store(&labels, &lab_text, label);
+}
+
+static void
+assemble(struct scanner *sc, struct value *gen)
+{
+	struct opcode_entry *oe;
+	struct value label;
+	int count;
+
+	while (!scanner_eof(sc)) {
+		if (scanner_tokeq(sc, ";")) {	/* A comment - ignore rest of line. */
+			scanner_scanline(sc);
+			continue;
+		}
+		if (scanner_tokeq(sc, ":")) {	/* A label - associate string with addr. */
+                        const char *str;
+                        int len;
+
+			scanner_scan(sc);
+                        str = scanner_token_string(sc);
+                        len = scanner_token_length(sc);
+			value_copy(&label, fetch_label(str, len));
+			/* This may cause backpatching */
+			if (gen_define_label(gen, &label)) {
+				store_label(str, len, &label);
+			} else {
+				scanner_report(sc, REPORT_ERROR, "Label already defined");
+			}
+			scanner_scan(sc);
+			continue;
+		}
+
+		for (oe = opcode_table; oe->token != NULL; oe++) {
+			if (scanner_tokeq(sc, oe->token))
+				break;
+		}
+		if (oe->token == NULL) {
+			scanner_report(sc, REPORT_ERROR, "Unrecognized token");
+			scanner_scan(sc);
+			continue;
+		}
+
+		gen_integer(gen, oe->opcode);
+		scanner_scan(sc);
+
+		for (count = 0; count < oe->arity; count++) {
+			if (scanner_tokeq(sc, ":")) {
+                                const char *str;
+                                int len;
+
+				scanner_scan(sc);           
+                                str = scanner_token_string(sc);
+                                len = scanner_token_length(sc);
+				value_copy(&label, fetch_label(str, len));
+				gen_gen_label_ref(gen, &label);
+				store_label(str, len, &label);
+				scanner_scan(sc);
+				continue;
+			} else if (scanner_tokeq(sc, "#")) {
+				struct value v;
+
+				scanner_scan(sc);
+				value_discern(&v, sc);
+				gen_value(gen, &v);
+				continue;
+			} else {
+				scanner_report(sc, REPORT_WARNING,
+				    "Unrecognized argument");
+				scanner_scan(sc);
+			}
+		}
+	}
+}
+
+static void
+assemble_main(struct value *args, struct value *result)
+{
+	struct process *out;
+	struct scanner *sc;
+        struct reporter *r;
+	struct value gen, flat; /* the generator that we will use to build vm code */
+	struct value *asmfile, *vmfile;
+        struct value asmfile_sym, vmfile_sym;
+
+  	r = reporter_new("Assembly", NULL, 1);
+
+        value_symbol_new(&asmfile_sym, "asmfile", 7);
+        value_symbol_new(&vmfile_sym, "vmfile", 6);
+
+        assert(value_is_tuple(args));
+  	asmfile = value_dict_fetch(args, &asmfile_sym);
+	vmfile = value_dict_fetch(args, &vmfile_sym);
+
+	/*
+	 * Generate.
+	 */
+	sc = scanner_new(r);
+  
+	if (!scanner_open(sc, value_symbol_get_token(asmfile))) {
+            value_integer_set(result, 1);
+            return;
+        }
+        
+	gen_new_default(&gen);
+	value_dict_new(&labels, 8);
+	assemble(sc, &gen);
+	gen_integer(&gen, INSTR_EOF);
+	scanner_close(sc);
+	scanner_free(sc);
+
+	/*
+	 * Write out.
+	 */
+	out = file_open(value_symbol_get_token(vmfile), "w");
+        gen_flatten(&gen, &flat);
+	value_save(out, &flat);
+	stream_close(NULL, out);
+
+	value_integer_set(result, reporter_has_errors(r) ? 1 : 0);
+        reporter_free(r);
+}
+
+MAIN(assemble_main)
+/*
+ * buildinfo.c
+ * Show some info about the built binary (also test for process-based I/O)
+ */
+
+#include "lib.h"
+
+#include "render.h"
+#include "cmdline.h"
+
+#include "process.h"
+#include "file.h"
+
+/* Main Program / Driver */
+
+static void
+buildinfo_main(struct value *args, struct value *result)
+{
+	struct process *out;
+
+	out = file_open("*stdout", "w");
+	args = args;
+	process_render(out,
+	    "sizeof(struct value) == %d\n", sizeof(struct value));
+        value_integer_set(result, 0);
+}
+
+MAIN(buildinfo_main)
+/*
+ * chain.c
+ * Routines for collecting chains.
+ * $Id: chain.c 143 2008-07-18 06:42:22Z catseye $
+ */
+
+#include "lib.h"
+#include "chain.h"
+#include "value.h"
+
+struct chain {
+	struct chain	*next;
+	struct value	 value;
+};
+
+struct chain *
+add_to_chain(struct chain *c, struct value *v)
+{
+	struct chain *n;
+
+	n = malloc(sizeof(struct chain));
+	n->next = NULL;
+	value_copy(&n->value, v);
+	if (c != NULL) {
+		c->next = n;
+	}
+	return n;
+}
+
+void
+populate_tuple_from_chain(struct value *v, struct chain *c)
+{
+	unsigned int i = 0;
+
+	while (c != NULL) {
+		value_tuple_store(v, i, &c->value);
+		i++;
+		c = c->next;
+	}
+}
+
+struct value *
+search_chain(struct chain *c, struct value *f)
+{
+	while (c != NULL) {
+		if (value_equal(f, &c->value))
+			return &c->value;
+		c = c->next;
+	}
+	return NULL;
+}
+
+void
+free_chain(struct chain *c)
+{
+	struct chain *next;
+
+	while (c != NULL) {
+		next = c->next;
+		free(c);
+		c = next;
+	}
+}
+/*
+ * chain.h
+ * Prototypes for chains.
+ */
+
+#ifndef __CHAIN_H_
+#define __CHAIN_H_
+
+#include "value.h"
+
+struct chain;
+
+struct chain	*add_to_chain(struct chain *, struct value *);
+void		 populate_tuple_from_chain(struct value *, struct chain *);
+struct value	*search_chain(struct chain *, struct value *);
+void		 free_chain(struct chain *);
+
+#endif /* !__CHAIN_H_ */
+/*
+ * cmdline.c
+ * Implementation of common command-line parsing functionality.
+ * $Id$
+ */
+
+#include "lib.h"
+#include "render.h"
+#include "process.h"
+#include "file.h"
+
+#include "cmdline.h"
+
+struct process	*process_std = NULL;
+struct process	*process_err = NULL;
+
+/*
+static const char *progname = NULL;
+static const struct argdecl *argdecl = NULL;
+*/
+
+/* eventually this should parse an optional cmdline schema to validate the cmd line */
+/* const struct argdecl *ad,  */
+
+int
+cmdline_parse(struct value *dict, int argc, char **argv)
+{
+    unsigned int arg_count, i;
+    struct value name, value;
+
+    assert(argc > 0);
+    arg_count = (unsigned int)argc;
+
+    value_dict_new(dict, 16); /* xxx */
+
+    for (i = 1; i < arg_count; i++) {
+        if (strlen(argv[i]) > 2 && argv[i][0] == '-' && argv[i][1] == '-') {
+            const char *arg_name = argv[i] + (2 * sizeof(char));
+            value_symbol_new(&name, arg_name, strlen(arg_name));
+            i++;
+            if (i < arg_count) {
+                value_symbol_new(&value, argv[i], strlen(argv[i]));
+                value_dict_store(dict, &name, &value);
+            } else {
+              /* complain */
+            }
+        } else {
+          /* complain */
+        }
+    }
+
+    return 1;
+}
+
+/* eventually this should parse an optional cmdline schema to validate the cmd line */
+
+void
+cmdline_usage(void)
+{
+    exit(1);
+}
+
+int
+cmdline_driver(void (*main)(struct value *, struct value *),
+               int argc, char **argv)
+{
+    struct value cmdline, result;
+
+    cmdline_parse(&cmdline, argc, argv);
+    assert(value_is_tuple(&cmdline));
+
+    main(&cmdline, &result);
+
+    return value_get_integer(&result);
+}
+/*
+ * cmdline.h
+ * Prototypes for common command-line parsing functionality.
+ * $Id$
+ */
+
+/*
+ * In the future, this should be progenv.h, which determines a
+ * program environment.  This should include an "entry point"
+ * (not necessarily "main", since it should be able to operate
+ * in standalone and other contexts where a "command line" is
+ * not meaningful.)
+ */
+
+#ifndef __CMDLINE_H_
+#define __CMDLINE_H_
+
+#include "value.h"
+
+#include "file.h"   /* ... because. */
+
+struct process;
+
+struct runenv {
+        struct handler	     *handler_std;
+	/*
+        struct stream	     *stream_std;
+        struct stream	     *stream_err;
+	*/
+        const char           *progname;
+};
+
+extern struct process	*process_err;
+
+int             cmdline_driver(void (*)(struct value *, struct value *),
+                               int, char **);
+int             cmdline_parse(struct value *, int, char **);
+void            cmdline_usage(void);
+
+/*
+ * STANDALONE no longer supported / not yet re-supported
+ */
+#ifdef STANDALONE
+#define	MAIN(function)							\
+	int main(int argc, char **argv)					\
+	{								\
+		process_err = NULL;					\
+                return cmdline_driver(function, argc, argv);		\
+	}
+#else
+#define	MAIN(function)							\
+	int main(int argc, char **argv)					\
+	{								\
+		process_err = file_open("*stderr", "w");		\
+                return cmdline_driver(function, argc, argv);		\
+	}
+#endif	/* STANDALONE */
+
+#endif	/* !__CMDLINE_H_ */
+/*
+ * disasm.c
+ * Disassembler for virtual machine.
+ */
+
+#include "lib.h"
+#include "cmdline.h"
+
+#include "value.h"
+#include "chain.h"
+
+#include "file.h"
+#include "stream.h"
+#include "render.h"
+
+#include "report.h"
+#include "instrtab.h"
+#include "load.h"
+#include "portray.h"
+
+/* Routines */
+
+static void
+disassemble(struct reporter *r, struct process *p, struct value *code)
+{
+	struct opcode_entry *oe;
+	int count, pc, opcode;
+	struct value *val;
+	struct value t;
+	struct chain *back, *front;
+
+	value_integer_set(&t, 0);
+
+	/* first pass: find label destination */
+
+	pc = 0;
+	back = front = add_to_chain(NULL, &t);
+	for (;;) {
+		val = value_tuple_fetch(code, pc);
+		if (value_is_integer(val)) {
+			opcode = value_get_integer(val);
+			if (opcode < 0 || opcode > INSTR_EOF) {
+				report(r, REPORT_ERROR, "Opcode not in range 0..%d", INSTR_EOF);
+				pc++;
+			} else if (opcode == INSTR_EOF) {
+				break;
+			} else {
+				oe = &opcode_table[opcode];
+				count = oe->arity;
+				pc++;
+				val = value_tuple_fetch(code, pc);
+				while (count > 0) {
+					if (oe->optype == OPTYPE_ADDR)
+						back = add_to_chain(back, val);
+					pc++;
+					val = value_tuple_fetch(code, pc);
+					count--;
+				}
+			}
+		} else {
+			report(r, REPORT_ERROR, "Opcode is not an integer");
+			pc++;
+		}
+	}
+
+	if (reporter_has_errors(r))
+		return;
+
+	/* second pass */
+	pc = 0;
+	for (;;) {
+		val = value_tuple_fetch(code, pc);
+		opcode = value_get_integer(val);
+		if (opcode == INSTR_EOF)
+			break;
+
+		value_integer_set(&t, pc);
+		if (search_chain(front, &t) != NULL) {
+			process_render(p, ":L%d\n", pc);
+		}
+
+		oe = &opcode_table[opcode];
+		process_render(p, "%s ", oe->token);
+		count = oe->arity;
+
+		pc++;
+		val = value_tuple_fetch(code, pc);
+
+		if (count == -1) {
+			report(r, REPORT_WARNING, "Unrecognized opcode");
+			process_render(p, "??? ");
+		} else {
+			while (count > 0) {
+				if (oe->optype == OPTYPE_ADDR) {
+					process_render(p, ":L%d ",
+					    value_get_integer(val)
+					);
+				} else {
+					process_render(p, "#");
+					value_portray(p, val);
+				}
+				pc++;
+				val = value_tuple_fetch(code, pc);
+				count--;
+			}
+		}
+
+		process_render(p, "\n");
+	}
+
+	free_chain(front);
+}
+
+static void
+disassemble_main(struct value *args, struct value *result)
+{
+        struct reporter *r;
+	struct process *p;
+	struct value code;	/* virtual machine code we will dump */
+	struct value *asmfile, *vmfile;
+        struct value asmfile_sym, vmfile_sym;
+
+        value_symbol_new(&asmfile_sym, "asmfile", 7);
+        value_symbol_new(&vmfile_sym, "vmfile", 6);
+
+  	asmfile = value_dict_fetch(args, &asmfile_sym);
+	vmfile = value_dict_fetch(args, &vmfile_sym);
+
+	r = reporter_new("Disassembly", NULL, 1);
+
+	/*
+	 * Load.
+	 */
+	p = file_open(value_symbol_get_token(vmfile), "r");
+	value_load(&code, p);
+	stream_close(NULL, p);
+
+	p = file_open(value_symbol_get_token(asmfile), "w");
+	disassemble(r, p, &code);
+	stream_close(NULL, p);
+
+	value_integer_set(result, reporter_has_errors(r) ? 1 : 0);
+	reporter_free(r);
+}
+
+MAIN(disassemble_main)
+/*
+ * discern.c
+ * Routines for parsing values (only; not program source) from a file-like process.
+ * $Id: discern.c 143 2008-07-18 06:42:22Z catseye $
+ */
+
+#include "lib.h"
+
+#include "scan.h"
+#include "value.h"
+#include "discern.h"
+#include "chain.h"
+
+/*
+ * A parser for values, inspired somewhat by Scheme/LISP S-expressions
+ * and Prolog/Erlang terms.
+ */
+int
+value_discern(struct value *top, struct scanner *sc)
+{
+	if (scanner_tokeq(sc, "<")) {
+		struct chain *front, *back;
+		struct value tag, inner;
+		unsigned int size = 1;
+
+		/* Tuple. */
+		scanner_scan(sc);
+		value_discern(&tag, sc);
+		scanner_expect(sc, ":");
+		value_discern(&inner, sc);
+		back = front = add_to_chain(NULL, &inner);
+		while (scanner_tokeq(sc, ",")) {
+			scanner_scan(sc);
+			value_discern(&inner, sc);
+			back = add_to_chain(back, &inner);
+			size++;
+		}
+		scanner_expect(sc, ">");
+		value_tuple_new(top, &tag, size);
+		populate_tuple_from_chain(top, front);
+		free_chain(front);
+		return 1;
+	} else if (scanner_tokeq(sc, "[")) {
+                struct value inner, *left, right;
+
+		/* List: Sequence of tail-nested Pairs. */
+		scanner_scan(sc);
+		if (scanner_tokeq(sc, "]")) {
+			scanner_scan(sc);
+			value_copy(top, &VNULL);
+			return 1;
+		}
+		value_tuple_new(top, &tag_list, 2);
+		value_discern(&inner, sc);
+		value_tuple_store(top, 0, &inner);
+		/*value_tuple_store(top, 1, VNULL);*/
+		left = top;
+		while (scanner_tokeq(sc, ",")) {
+			scanner_scan(sc);
+			value_tuple_new(&right, &tag_list, 2);
+			value_discern(&inner, sc);
+			value_tuple_store(&right, 0, &inner);
+			/*value_set_index(right, 1, VNULL);*/
+			value_tuple_store(left, 1, &right);
+			left = value_tuple_fetch(left, 1);
+		}
+		if (scanner_tokeq(sc, "|")) {
+			scanner_scan(sc);
+			value_discern(&right, sc);
+			value_tuple_store(left, 1, &right);
+		}
+		scanner_expect(sc, "]");
+		return 1;
+	} else if (scanner_tokeq(sc, "{")) {
+		struct value left, right;
+
+		/* Dictionary: associations between keys and values. */
+		scanner_scan(sc);
+		value_dict_new(top, 31);
+
+		value_discern(&left, sc);
+		scanner_expect(sc, "=");
+		value_discern(&right, sc);
+		value_dict_store(top, &left, &right);
+		while (scanner_tokeq(sc, ",")) {
+			scanner_scan(sc);
+			value_discern(&left, sc);
+			scanner_expect(sc, "=");
+			value_discern(&right, sc);
+			value_dict_store(top, &left, &right);
+		}
+		scanner_expect(sc, "}");
+		return 1;
+	} else if (k_isdigit(scanner_token_string(sc)[0])) {
+		/* Integer. */
+		value_integer_set(top, k_atoi(scanner_token_string(sc),
+					      scanner_token_length(sc)));
+		scanner_scan(sc);
+		return 1;
+	} else {
+		/* Symbol. */
+		value_symbol_new(top, scanner_token_string(sc),
+				 scanner_token_length(sc));
+		scanner_scan(sc);
+		return 1;
+	}
+}
+/*
+ * discern.h
+ * Prototypes and structures for parsing values (only; not full
+ * program source code) from textual streams.
+ */
+
+#ifndef __DISCERN_H_
+#define __DISCERN_H_
+
+struct value;
+struct scanner;
+
+int value_discern(struct value *, struct scanner *);
+
+#endif /* !__DISCERN_H_ */
+/*
+ * file.c
+ * Native processes for communicating with (reading, writing) files,
+ * exposing a stream-like interface.
+ */
+
+#include <stdio.h>
+
+#include "lib.h"
+#include "process.h"
+#include "file.h"
+
+#ifdef DEBUG
+#include "stream.h"
+#include "portray.h"
+#include "cmdline.h"
+#endif
+
+static void run(struct process *p)
+{
+	struct value msg;
+	struct value response;
+	struct process *sender;
+	size_t result;
+	char *buffer;
+	size_t size;
+
+	assert(p != NULL);
+	assert(p->aux != NULL);
+
+	while (process_dequeue(p, &msg)) {
+#ifdef DEBUG
+/*
+		stream_write(NULL, process_err, "-->", 3);
+		value_portray(process_err, &msg);
+		stream_write(NULL, process_err, "<--", 3);
+*/
+#endif
+		if (value_is_tuple(&msg)) {
+			const char *tag = value_symbol_get_token(value_tuple_get_tag(&msg));
+		        if (strcmp(tag, "write") == 0) {
+				struct value *payload = value_tuple_fetch(&msg, 0);
+
+				result = fwrite(
+				    value_symbol_get_token(payload),
+				    value_symbol_get_length(payload),
+				    1, (FILE *)p->aux
+				);
+                        	if (result) {
+					/* some kind of error occurred */
+				}
+		        } else if (strcmp(tag, "read") == 0) {
+				sender = value_get_process(value_tuple_fetch(&msg, 0));
+				size = value_get_integer(value_tuple_fetch(&msg, 1));
+
+				buffer = value_symbol_new_buffer(&response, size);
+				result = fread(buffer, size, 1, (FILE *)p->aux);
+                        	if (result) {
+					/* some kind of error occurred, or we just need more */
+				}
+				process_enqueue(sender, &response);
+			} else if (strcmp(tag, "eof") == 0) {
+				sender = value_get_process(value_tuple_fetch(&msg, 0));
+
+				value_boolean_set(&response, feof((FILE *)p->aux));
+				process_enqueue(sender, &response);
+			} else if (strcmp(tag, "close") == 0) {
+				fclose((FILE *)p->aux);
+				p->aux = NULL;
+			}
+		}
+	}
+
+	p->waiting = 1;
+}
+
+static struct process *
+file_fopen(FILE *file)
+{
+	struct process *p;
+
+	p = process_new();
+	p->run = run;
+	p->aux = file;
+
+	return p;
+}
+
+struct process *
+file_open(const char *locator, const char *mode)
+{
+	FILE *f;
+
+	if (strcmp(locator, "*stdin") == 0) {
+		return file_fopen(stdin);
+	}
+	if (strcmp(locator, "*stdout") == 0) {
+		return file_fopen(stdout);
+	}
+	if (strcmp(locator, "*stderr") == 0) {
+		return file_fopen(stderr);
+	}
+
+	if ((f = fopen(locator, mode)) == NULL) {
+		/* XXX only if mode contains a 'strict' char flag... */
+		/* XXX this should use report.c somehow someday */
+		fprintf(stderr,
+		   "Could not open '%s' in mode '%s'", locator, mode);
+		exit(1);
+	}
+
+	return file_fopen(f);
+}
+/*
+ * file.h
+ * Native processes for communicating with (reading, writing) files,
+ * exposing a stream-like interface.
+ */
+
+#ifndef __FILE_H_
+#define __FILE_H_
+
+struct process;
+
+struct process	*file_open(const char *, const char *);
+
+#endif /* !__FILE_H_ */
+/*
+ * freeze.c
+ * Parse a constant term from a text file and
+ * write it out to a binary termfile.
+ * $Id: freeze.c 146 2008-12-06 19:59:54Z catseye $
+ */
+
+#include "lib.h"
+#include "cmdline.h"
+
+#include "stream.h"
+#include "file.h"
+
+#include "report.h"
+#include "value.h"
+
+#include "discern.h"
+#include "scan.h"
+#include "save.h"
+
+/* Main Program / Driver */
+
+static void
+freeze_main(struct value *args, struct value *result)
+{
+	struct process *p;
+	struct scanner *sc;
+        struct reporter *r;
+	struct value term;
+
+        struct value *termfile, *binfile;
+        struct value termfile_sym, binfile_sym;
+
+        value_symbol_new(&termfile_sym, "termfile", 8);
+        value_symbol_new(&binfile_sym, "binfile", 7);
+  	termfile = value_dict_fetch(args, &termfile_sym);
+	binfile = value_dict_fetch(args, &binfile_sym);
+
+	r = reporter_new("Freezing", NULL, 1);
+
+	/*
+	 * Parse.
+	 */
+	sc = scanner_new(r);
+	if (!scanner_open(sc, value_symbol_get_token(termfile))) {
+                value_integer_set(result, 1);
+                return;
+        }
+	if (!value_discern(&term, sc)) {
+		report(r, REPORT_ERROR, "Could not parse input term");
+        }
+
+	/*
+	 * Write out.
+	 */
+	p = file_open(value_symbol_get_token(binfile), "w");
+	if (!value_save(p, &term)) {
+		report(r, REPORT_ERROR,
+		   "Could not write term to '%s'", binfile);
+	}
+	stream_close(NULL, p);
+
+	/*
+	 * Finish up.
+	 */
+	scanner_close(sc);
+	scanner_free(sc);
+
+	value_integer_set(result, reporter_has_errors(r) ? 1 : 0);
+	reporter_free(r);
+}
+
+MAIN(freeze_main)
+/*
+ * gen.c
+ * Accumulate values in tuples with extents.
+ */
+
+#include "lib.h"
+
+#include "gen.h"
+
+#ifdef DEBUG
+#include "cmdline.h"
+#include "portray.h"
+#include "render.h"
+#endif
+
+/*
+ * This was originally intended to generate VM instructions, however
+ * it has more general applicability.
+ */
+
+#define GEN_ROOT_TUPLE		0 /* The top tuple of the hierarchy */
+#define GEN_CURRENT_TUPLE	1 /* The tuple currently being gen'ed into */
+#define GEN_GLOBAL_POS		2 /* The global position to be gen'ed into */
+#define GEN_LOCAL_POS		3 /* Gen offset into current tuple */
+
+#define GEN_SIZE		4
+
+int
+gen_new(struct value *gen, struct value *tuple, unsigned int pos)
+{
+	struct value gen_tag;
+
+	value_symbol_new(&gen_tag, "gen", 3);
+	if (!value_tuple_new(gen, &gen_tag, GEN_SIZE))
+		return 0;
+	value_tuple_store(gen, GEN_ROOT_TUPLE, tuple);
+	value_tuple_store(gen, GEN_CURRENT_TUPLE, tuple);
+	value_tuple_store_integer(gen, GEN_GLOBAL_POS, pos);
+	value_tuple_store_integer(gen, GEN_LOCAL_POS, pos);
+	return 1;
+}
+
+int
+gen_new_default(struct value *gen)
+{
+	struct value tuple, code_sym;
+
+	value_symbol_new(&code_sym, "code", 4);
+	value_tuple_new(&tuple, &code_sym, 8192);
+	return gen_new(gen, &tuple, 0);
+}
+
+/*
+ * Accumulate a value to a tuple.
+ */
+void
+gen_value(struct value *gen, struct value *value)
+{
+	unsigned int pos = value_tuple_fetch_integer(gen, GEN_LOCAL_POS);
+	struct value *tuple = value_tuple_fetch(gen, GEN_CURRENT_TUPLE);
+	unsigned int size = value_tuple_get_size(tuple);
+	struct value new_tuple;
+
+	if (pos == (size - 1)) {
+		/* Last slot.  Allocate a new tuple of twice the running size and plug it in. */
+		struct value *tag = value_tuple_get_tag(tuple);
+
+		value_tuple_new(&new_tuple, tag, size * 2);
+		value_tuple_store(tuple, pos, &new_tuple);
+		value_tuple_store(gen, GEN_CURRENT_TUPLE, &new_tuple);
+		pos = 0;
+		tuple = &new_tuple;
+	}
+	value_tuple_store(tuple, pos, value);
+	pos++;
+	value_tuple_store_integer(gen, GEN_LOCAL_POS, pos);
+
+	pos = value_tuple_fetch_integer(gen, GEN_GLOBAL_POS);
+	pos++;
+	value_tuple_store_integer(gen, GEN_GLOBAL_POS, pos);
+}
+
+/*
+ * Accumulate a value to a tuple.
+ */
+void
+gen_integer(struct value *gen, int i)
+{
+	struct value val;
+
+	value_integer_set(&val, i);
+	gen_value(gen, &val);
+}
+
+/*
+ * Labels - forward reference and backpatching mechanism.
+ * This is designed to be used internally.  External clients
+ * (that is, where labels must be human-readable, such as the
+ * assembler) should maintain a dictionary associating strings
+ * with labels.
+ *
+ * Use cases:
+ *
+ * 1) Backward reference (requires no backpatching):
+ *
+ *      label = gen_define_label(gen, NULL);
+ *      ...
+ *      gen_integer(gen, INSTR_GOTO);
+ *      gen_gen_label_ref(gen, label);
+ *
+ * 2) Forward reference (requires backpatching):
+ *
+ *      gen_integer(gen, INSTR_GOTO);
+ *      label = gen_gen_label_ref(gen, NULL);
+ *      ...
+ *      gen_define_label(gen, label);
+ *
+ */
+
+#define GEN_LABEL_TUPLE		0 /* Tuple into which the label points */
+#define GEN_LABEL_LOCAL_POS	1 /* local pos to which label refers */
+#define GEN_LABEL_GLOBAL_POS	2 /* global pos to which label refers */
+#define GEN_LABEL_NEXT		3 /* list of backpatches to apply */
+
+#define GEN_LABEL_SIZE		4
+
+static int
+gen_label_new(struct value *gen_label)
+{
+	struct value gen_label_tag;
+
+	value_symbol_new(&gen_label_tag, "genlab", 3);
+	if (!value_tuple_new(gen_label, &gen_label_tag, GEN_LABEL_SIZE))
+		return 0;
+	value_tuple_store(gen_label, GEN_LABEL_TUPLE, &VNULL);
+	value_tuple_store_integer(gen_label, GEN_LABEL_LOCAL_POS, 0);
+	value_tuple_store_integer(gen_label, GEN_LABEL_GLOBAL_POS, 0);
+	value_tuple_store(gen_label, GEN_LABEL_NEXT, &VNULL);
+
+	return 1;
+}
+
+/*
+ * Generate a reference to a label into the tuple, for instance as the
+ * immediate argument of a branch instruction.  If the label parameter
+ * is NULL, this will generate and return a forward reference which
+ * should be resolved by subsequently passing it to gen_define_label().
+ */
+void
+gen_gen_label_ref(struct value *gen, struct value *gen_label)
+{
+	struct value *bp;
+	struct value next;
+	int global_pos;
+
+	assert(gen_label != NULL);
+
+	if (value_is_null(gen_label)) {
+		/* Not yet allocated, so allocate a new undefined one. */
+		gen_label_new(gen_label);
+	}
+
+	global_pos = value_tuple_fetch_integer(gen_label, GEN_LABEL_GLOBAL_POS);
+	if (global_pos != 0) {
+		/* Already defined, so just use it. */
+		gen_integer(gen, global_pos);
+		return;
+	}
+
+	/*
+	 * The label is newly allocated, or at least has not been defined
+	 * yet.  So, we remember that we will need to backpatch here in the
+	 * future (by adding an entry to the label's backpatch list) and,
+	 * for now, generate a NULL in its slot.
+	 */
+
+	value_copy(&next, value_tuple_fetch(gen_label, GEN_LABEL_NEXT));
+	bp = value_tuple_fetch(gen_label, GEN_LABEL_NEXT);
+	gen_label_new(bp);
+	value_tuple_store(bp, GEN_LABEL_TUPLE, value_tuple_fetch(gen, GEN_CURRENT_TUPLE));
+	value_tuple_store(bp, GEN_LABEL_LOCAL_POS, value_tuple_fetch(gen, GEN_LOCAL_POS));
+	value_tuple_store(bp, GEN_LABEL_GLOBAL_POS, value_tuple_fetch(gen, GEN_GLOBAL_POS));
+	value_tuple_store(bp, GEN_LABEL_NEXT, &next);
+
+	gen_value(gen, &VNULL);
+}
+
+/*
+ * Define a label.  If the label is already allocated, this will
+ * cause previously-generated forward references to this label to be
+ * backpatched with the now-known location.
+ */
+int
+gen_define_label(struct value *gen, struct value *gen_label)
+{
+	struct value *tuple = value_tuple_fetch(gen, GEN_CURRENT_TUPLE);
+	unsigned int local_pos = value_tuple_fetch_integer(gen, GEN_LOCAL_POS);
+	unsigned int global_pos = value_tuple_fetch_integer(gen, GEN_GLOBAL_POS);
+
+	struct value *bp;
+
+	assert(gen_label != NULL);
+
+	if (value_is_null(gen_label)) {
+		/* Not yet allocated, so allocate a new undefined one. */
+		gen_label_new(gen_label);
+	} else {
+		/* Fail if we try to redefine a label. */
+		if (value_tuple_fetch_integer(gen_label, GEN_LABEL_GLOBAL_POS) != 0) {
+			return 0;
+		}
+	}
+
+	value_tuple_store(gen_label, GEN_LABEL_TUPLE, tuple);
+	value_tuple_store_integer(gen_label, GEN_LABEL_LOCAL_POS, local_pos);
+	value_tuple_store_integer(gen_label, GEN_LABEL_GLOBAL_POS, global_pos);
+
+	/*
+	 * Resolve any previous forward references by backpatching.
+	 */
+	bp = value_tuple_fetch(gen_label, GEN_LABEL_NEXT);
+	while (!value_is_null(bp)) {
+		/*
+		 * Backpatch, by placing the current global position into
+		 * the slot named by the bp entry, then remove entry from list.
+		 */
+		value_tuple_store_integer(
+		    value_tuple_fetch(bp, GEN_LABEL_TUPLE),
+		    value_tuple_fetch_integer(bp, GEN_LABEL_LOCAL_POS),
+		    global_pos
+		);
+		value_tuple_store(gen_label, GEN_LABEL_NEXT,
+		    value_tuple_fetch(bp, GEN_LABEL_NEXT));
+		bp = value_tuple_fetch(gen_label, GEN_LABEL_NEXT);
+	}
+
+	return 1;
+}
+
+/*
+ * Return a single tuple containing the contents of the gen'ed tuple chain.
+ */
+void
+gen_flatten(struct value *gen, struct value *dest)
+{
+	unsigned int dest_size = value_tuple_fetch_integer(gen, GEN_GLOBAL_POS);
+	struct value *current = value_tuple_fetch(gen, GEN_ROOT_TUPLE);
+	unsigned int j = 0;
+
+	value_tuple_new(dest, value_tuple_get_tag(current), dest_size);
+
+	while (!value_is_null(current)) {
+		unsigned int i = 0;
+		unsigned int current_size = value_tuple_get_size(current);
+		struct value *x;
+		for (i = 0; i < (current_size - 1) && i < dest_size; i++) {
+			x = value_tuple_fetch(current, i);
+
+#ifdef DEBUG
+			process_render(process_err, "(flatten %d/%d -> %d/%d: ", i, current_size, j, dest_size);
+			value_portray(process_err, x);
+			process_render(process_err, ")\n");
+#endif
+
+			value_tuple_store(dest, j, x);
+			j++;
+		}
+		current = value_tuple_fetch(current, (current_size - 1));
+	}
+}
+/*
+ * gen.h
+ * Accumulate values in tuples with extents.
+ * $Id: gen.h 144 2008-07-19 01:57:04Z catseye $
+ */
+
+#ifndef __GEN_H_
+#define __GEN_H_
+
+#include "value.h"
+
+int			 gen_new(struct value *, struct value *, unsigned int);
+int			 gen_new_default(struct value *);
+void			 gen_value(struct value *, struct value *);
+void			 gen_integer(struct value *, int);
+int			 gen_define_label(struct value *, struct value *);
+void			 gen_gen_label_ref(struct value *, struct value *);
+void			 gen_flatten(struct value *, struct value *);
+
+#endif /* !__GEN_H_ */
+/*
+ * geninstr.c
+ */
+
+#include <ctype.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+static int
+parse_descriptor_line(const char *line, char *name, char *mode)
+{
+        while (isspace((int)*line) && (*line != '\0')) {
+                line++;
+        }
+        if (*line != '%')
+                return 0;
+        line++;
+        while (isspace((int)*line) && (*line != '\0')) {
+                line++;
+        }
+        while (!isspace((int)*line) && (*line != '\0')) {
+                *name = *line;
+                name++;
+                line++;
+        }
+        *name = '\0';
+        while (isspace((int)*line) && (*line != '\0')) {
+                line++;
+        }
+        while (!isspace((int)*line) && (*line != '\0')) {
+                *mode = *line;
+                mode++;
+                line++;
+        }
+        *mode = '\0';
+        return 1;
+}
+
+int
+main(int argc, char **argv)
+{
+        FILE *instrtab, *instrenum, *instrlab, *vm, *localtypes;
+        static char line[512], name[80], mode[80];
+
+        argc = argc;
+        argv = argv;
+
+        if ((instrtab = fopen("instrtab.c", "w")) == NULL) {
+            perror("Couldn't open instrtab.c for writing");
+            exit(1);
+        }
+        if ((instrenum = fopen("instrenum.h", "w")) == NULL) {
+            perror("Couldn't open instrenum.h for writing");
+            exit(1);
+        }
+        if ((instrlab = fopen("instrlab.h", "w")) == NULL) {
+            perror("Couldn't open instrlab.h for writing");
+            exit(1);
+        }
+        if ((vm = fopen("vm.c", "r")) == NULL) {
+            perror("Couldn't open vm.c for reading");
+            exit(1);
+        }
+	fputs(
+"/*\n"
+" * instrtab.c\n"
+" * Table of VM instructions, mapping names to opcodes.\n"
+" * NOTE: THIS FILE WAS AUTOMATICALLY GENERATED from vm.c by geninstr\n"
+" */\n"
+"\n"
+"#include \"lib.h\"\n"
+"\n"
+"#include \"instrtab.h\"\n"
+"\n"
+"struct opcode_entry opcode_table[] = {\n", instrtab);
+	fputs(
+"/*\n"
+" * instrenum.h\n"
+" * C 'enum' type for VM instructions.\n"
+" * NOTE: THIS FILE WAS AUTOMATICALLY GENERATED from vm.c by geninstr\n"
+" */\n"
+"\n"
+"#ifndef __INSTRENUM_H_\n"
+"#define __INSTRENUM_H_\n"
+"\n"
+"enum opcode {\n", instrenum);
+	fputs(
+"/*\n"
+" * instrlab.h\n"
+" * gcc labels for VM instructions, to support direct threading.\n"
+" * NOTE: THIS FILE WAS AUTOMATICALLY GENERATED from vm.c by geninstr\n"
+" */\n"
+"\n"
+"#ifndef __INSTRLAB_H_\n"
+"#define __INSTRLAB_H_\n"
+"\n"
+"static clabel instr_label[] = {\n", instrlab);
+
+        while (fgets(line, 510, vm)) {
+                int arity = 1;
+                const char *optype = "NONE";
+
+                if (!parse_descriptor_line(line, name, mode))
+                        continue;
+
+                switch (mode[0]) {
+                    case 'i':
+                        optype = "INT";
+                        break;
+                    case 'a':
+                        optype = "ADDR";
+                        break;
+                    case 'v':
+                        optype = "VALUE";
+                        break;
+                    default:
+                        arity = 0;
+                        break;
+                }
+
+                fprintf(instrtab, "\t{ \"%s\",\tINSTR_%s,\t%d,\tOPTYPE_%s\t},\n",
+                        name, name, arity, optype);
+                fprintf(instrenum, "\tINSTR_%s,\n", name);
+                fprintf(instrlab, "\t&&LABEL_INSTR_%s,\n", name);
+        }
+        fclose(vm);
+        fputs("\t{ NULL,\t\tINSTR_NULL,\t0,\tOPTYPE_NONE }\n};\n", instrtab);
+        fclose(instrtab);
+        fputs("\tINSTR_NULL\n};\n\n#endif /* !__INSTRENUM_H_ */\n", instrenum);
+        fclose(instrenum);
+        fputs("\tNULL\n};\n\n#endif /* !__INSTRLAB_H_ */\n", instrlab);
+        fclose(instrlab);
+
+        if ((localtypes = fopen("localtypes.h", "w")) == NULL) {
+            perror("Couldn't open localtypes.h for writing");
+            exit(1);
+        }
+	fputs(
+"/*\n"
+" * localtypes.h\n"
+" * Define some types for the local system.\n"
+" * NOTE: THIS FILE WAS AUTOMATICALLY GENERATED by geninstr\n"
+" */\n"
+"\n"
+"#ifndef __LOCALTYPES_H_\n"
+"#define __LOCALTYPES_H_\n"
+"\n", localtypes);
+        if (sizeof(void *) == sizeof(unsigned int)) {
+                fputs("#define PTR_INT unsigned int\n", localtypes);
+        } else if (sizeof(void *) == sizeof(unsigned long int)) {
+                fputs("#define PTR_INT unsigned long int\n", localtypes);
+        } else {
+                fputs("#define PTR_INT not_supported\n", localtypes);
+        }
+	fputs("\n#endif /* !__LOCALTYPES_H_ */\n", localtypes);
+        fclose(localtypes);
+
+        exit(0);
+}
+/*
+ * instrtab.h
+ * Table of VM instructions, mapping names to opcodes.
+ * $Id: instrtab.h 95 2006-02-14 01:33:18Z catseye $
+ */
+
+#ifndef __INSTRTAB_H_
+#define __INSTRTAB_H_
+
+#include "instrenum.h"
+
+enum optype {
+	OPTYPE_NONE,
+	OPTYPE_INT,
+	OPTYPE_ADDR,
+	OPTYPE_VALUE
+};
+
+struct opcode_entry {
+	const char	*token;
+	enum opcode	 opcode;
+	int		 arity;
+	enum optype	 optype;
+};
+
+extern struct opcode_entry opcode_table[];
+
+#endif /* !__INSTRTAB_H_ */
+/*
+ * lib.c
+ * Common functions.
+ * $Id: lib.c 114 2007-02-27 02:09:02Z catseye $
+ */
+
+#include "lib.h"
+
+#ifdef STANDALONE
+/*
+ * To quote the FreeBSD manpage (despite that this is a reimplementation):
+ * The strncpy() function copies not more than len characters from src into
+ * dst, appending `\0' characters if src is less than len characters long,
+ * and not terminating dst otherwise.
+ */
+#ifndef USE_SYSTEM_STRNCPY
+char *
+strncpy(char *dst, const char *src, unsigned int len)
+{
+	char *r = dst;
+
+	while (*src != '\0' && len > 0) {
+		*dst = *src;
+		dst++;
+		src++;
+		len--;
+	}
+	while (len > 0) {
+		/* append \0 characters */
+		*dst = '\0';
+		dst++;
+		len--;
+	}
+
+	return r;
+}
+#endif /* !USE_SYSTEM_STRNCPY */
+
+int
+k_isspace(char c)
+{
+	return c == ' ' || c == '\t' || c == '\n' || c == '\r';
+}
+
+int
+k_isdigit(char c)
+{
+	return c >= '0' && c <= '9';
+}
+
+int
+k_isalpha(char c)
+{
+	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
+}
+
+#ifndef USE_SYSTEM_STRCMP
+int
+strcmp(const char *s1, const char *s2)
+{
+	while (*s1 == *s2 && *s1 != '\0' && *s2 != '\0') {
+		s1++;
+		s2++;
+	}
+	if (*s1 == '\0' && *s2 == '\0') {
+		return 0;
+	}
+	if (*s1 > *s2) {
+		return 1;
+	} else {
+		return -1;
+	}
+}
+#endif /* !USE_SYSTEM_STRCMP */
+
+#endif /* STANDALONE */
+
+int
+k_atoi(const char *s, unsigned int len)
+{
+	int acc = 0;
+
+	while (k_isspace(*s) && len > 0) {
+		s++;
+		len--;
+	}
+	while (k_isdigit(*s) && len > 0) {
+		acc = acc * 10 + (*s - '0');
+		s++;
+		len--;
+	}
+
+	return acc;
+}
+/*
+ * lib.h
+ * Common definitions and function prototypes.
+ * $Id: lib.h 145 2008-11-26 10:05:16Z catseye $
+ */
+
+#ifndef	__LIB_H_
+#define	__LIB_H_
+
+#ifndef NULL
+#define	NULL	0
+#endif
+
+#ifndef NDEBUG
+#include <stdio.h>
+#define	assert(cond) {							\
+	if (!(cond)) {							\
+		fprintf(stderr,						\
+		   "assertion failed (" __FILE__ "): " #cond "\n");	\
+		abort();						\
+	}								\
+}
+#else
+#define	assert(cond)
+#endif
+
+#ifdef STANDALONE
+/* stdlib.h */
+void	*malloc(unsigned int);
+void	 free(void *);
+void	 exit(int);
+void	 abort(void);
+
+/* string.h */
+int	 strcmp(const char *, const char *);
+int	 strncmp(const char *, const char *, unsigned int);
+char	*strncpy(char *, const char *, unsigned int);
+int      strlen(const char *);
+void	*memset(void *, int, unsigned int);
+
+/* ctype.h */
+int	 k_isspace(char);
+int	 k_isdigit(char);
+int	 k_isalpha(char);
+#else
+#include <stdlib.h>
+#include <string.h>
+#include <ctype.h>
+#define k_isspace(x) isspace((int)x)
+#define k_isdigit(x) isdigit((int)x)
+#define k_isalpha(x) isalpha((int)x)
+#endif
+
+int	 k_atoi(const char *, unsigned int);
+
+#endif	/* !__LIB_H_ */
+/*
+ * load.c
+ * Load values from a stream-like process.
+ */
+
+#include "lib.h"
+
+#include "load.h"
+#include "value.h"
+#include "stream.h"
+