1. jordan Earls
  2. LightVM



Light VM stuff


  • Minimalistic assembler (bootstrap and then eventually implemented in LightVM?)
  • LightVM machine. A basic byte-by-byte interpreter in C


I designed my build system to be as easy and portable as possible. You can edit the CFLAGS variable in Makefile if needed for your architecture. Beyond that, you can build it with

$ make

And run unit tests with

$ make test

There is no install functionality and most likely never will be. Also, I despise ./configure scripts, so I assume you know enough about your platform to manually modify the Makefile if needed.


LightVM is a register-based machine. It's a simple 16-bit machine with a 16-bit memory address space, so 64K is currently its memory limit.

LightVM is in no way "isolated". It will make basic error checks, but is in no way guaranteed to be secure. It DOES support dynamic code generation just like x86.

It is a von Neumann architecture rather than a Harvard architecture. This means that code IS data and vice versa.
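A minimal sketch of what the 64K limit and the shared code/data space imply. The names `vm_t`, `vm_load8` and `vm_store8` are hypothetical and are not taken from LightVM's source; they only illustrate that a 16-bit address can reach exactly 65536 bytes and that a store can land on top of an instruction.

```c
#include <stdint.h>

#define VM_MEM_SIZE 65536u /* 2^16 bytes: everything a 16-bit address can reach */

/* Hypothetical VM state for illustration; not LightVM's actual struct. */
typedef struct {
    uint8_t mem[VM_MEM_SIZE]; /* code and data live in this one array */
} vm_t;

/* A uint16_t address can never index past the end of mem. */
static uint8_t vm_load8(const vm_t *vm, uint16_t addr) {
    return vm->mem[addr];
}

/* Nothing distinguishes code from data, so this may overwrite an instruction. */
static void vm_store8(vm_t *vm, uint16_t addr, uint8_t value) {
    vm->mem[addr] = value;
}
```

Because there is no separate code segment in this model, dynamic code generation is just an ordinary store followed by a jump to the stored bytes.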

Each register is 16 bits wide. There are 16 registers, but registers really only function as a shortcut into memory for keeping code short. Most registers are general purpose. There are some exceptions, though:

  • TR -- Truth Register. Used for conditional branches
  • SR -- Stack Register. Used for pushing and popping to the stack
  • IP -- Instruction Pointer. Controls where the VM is currently reading instructions from
  • CR -- Compare Register. Used for comparisons

Oddly enough, all registers are readable and writeable. This has the potential for some interesting "instructions"
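As a rough illustration, the register file could be modeled like this. The index values assigned to CR, TR, SR and IP are assumptions invented for the example (the text above does not give the real numbering), and `vm_regs_t`, `reg_read` and `reg_write` are hypothetical names.

```c
#include <stdint.h>

#define VM_NUM_REGS 16u

/* ASSUMPTION: these index assignments are invented for illustration;
   the real LightVM numbering may differ. */
enum {
    REG_CR = 12, /* Compare Register: operand for comparisons */
    REG_TR = 13, /* Truth Register: consulted by conditional branches */
    REG_SR = 14, /* Stack Register: used by push and pop */
    REG_IP = 15  /* Instruction Pointer: where the next instruction is read */
};

typedef struct {
    uint16_t r[VM_NUM_REGS]; /* 16 registers, 16 bits each */
} vm_regs_t;

static uint16_t reg_read(const vm_regs_t *regs, unsigned idx) {
    return regs->r[idx % VM_NUM_REGS];
}

/* No register is write-protected, so storing to REG_IP redirects execution. */
static void reg_write(vm_regs_t *regs, unsigned idx, uint16_t value) {
    regs->r[idx % VM_NUM_REGS] = value;
}
```

In this model a plain move into IP behaves like a jump, which is one example of the "interesting instructions" that fall out of every register being writable.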

Instructions are sort of fixed-width. They are usually 2 bytes. When an immediate is specified, they are 4 bytes.

There are two primary modes of operation of the VM: 16-bit mode and 8-bit mode. Obviously, some things are not intuitive in 8-bit mode (immediates are still 2 bytes, so it's wasteful).
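The instruction sizes described above can be sketched as a length function. The actual bit layout is not given here, so the flag position is purely an assumption: the example pretends that bit 7 of the second byte marks a trailing 2-byte immediate. `HYP_IMMEDIATE_FLAG` and `insn_length` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: bit 7 of the second byte flags a trailing 16-bit immediate.
   The real LightVM encoding may use a different bit or a different byte. */
#define HYP_IMMEDIATE_FLAG 0x80u

/* Returns the total size of the instruction starting at insn:
   2 bytes normally, 4 bytes when a 2-byte immediate follows. */
static size_t insn_length(const uint8_t *insn) {
    return (insn[1] & HYP_IMMEDIATE_FLAG) ? 4u : 2u;
}
```

The immediate stays 2 bytes wide in both 8-bit and 16-bit mode, so the length never depends on the current mode in this sketch.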

Opcode Architecture

In order to make implementation easier and smaller, I went with a RISC-ish architecture. It's not truly RISC, but it has a very terse standardized format for every opcode with a total of 3 special case opcodes that deviate from that format. Because of this, I can very easily have a "fetch" pipeline followed by an "operation" opcode. The length of the opcode and the argument(s) to the opcode can be gathered by a very quick scan through the top 4 bits of the opcode and the last 8 bits, mostly.

This standardized format of course has some interesting edge cases. For instance, it's possible to do mv r0, [100], but not possible to do mv [100], r0 because of the way immediates are indicated in an opcode.

I also avoided the design decision to make every opcode conditional. I figured a (usually) pointless branch on each opcode would be wasteful.


The opcodes are very "pure". There is no flags register, so nothing other than the arguments you specify (or that are implied) is modified. As a result of having no flags register, comparisons are very different. More detail can be gleaned from docs/compares.md

With the way I designed comparisons, instead of doing things like "branch if carry is set", I went with a higher-level approach like "set TR if CR is greater than X". Because I also support "or" and "and" attached to comparisons, this makes range comparisons extremely trivial.
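A small sketch of how such comparisons compose into a range check. This is not the encoding documented in docs/compares.md; the names `cmp_state_t`, `cmp_greater`, `cmp_less` and the `CMP_SET`/`CMP_AND`/`CMP_OR` modes are hypothetical, and unsigned comparison is assumed.

```c
#include <stdint.h>

/* Hypothetical model: only the two registers involved in comparisons. */
typedef struct {
    uint16_t cr; /* Compare Register: the value being tested */
    uint16_t tr; /* Truth Register: 1 if the condition holds, else 0 */
} cmp_state_t;

/* How a comparison result is merged into TR. */
typedef enum { CMP_SET, CMP_AND, CMP_OR } cmp_combine_t;

static void cmp_apply(cmp_state_t *s, int result, cmp_combine_t how) {
    uint16_t bit = result ? 1u : 0u;
    switch (how) {
    case CMP_SET: s->tr = bit; break;
    case CMP_AND: s->tr = (uint16_t)(s->tr && bit); break;
    case CMP_OR:  s->tr = (uint16_t)(s->tr || bit); break;
    }
}

/* "set TR if CR is greater than X" (unsigned comparison assumed) */
static void cmp_greater(cmp_state_t *s, uint16_t x, cmp_combine_t how) {
    cmp_apply(s, s->cr > x, how);
}

/* "set TR if CR is less than X" */
static void cmp_less(cmp_state_t *s, uint16_t x, cmp_combine_t how) {
    cmp_apply(s, s->cr < x, how);
}
```

Testing whether CR lies strictly between 10 and 20 then takes two steps: `cmp_greater(&s, 10, CMP_SET)` followed by `cmp_less(&s, 20, CMP_AND)`, after which TR holds the answer and no flags were touched.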


Primary RAM cannot be added or deleted and must be one contiguous block of memory. Obviously, this is the fastest RAM available.

VM memory can also be "added", but it is considered "extended" because it will be horribly slow due to having to scan through a list of extended memory for each operation.

It's an option you can take, but it is faster to shovel data into general RAM if you need external DMA-like functionality.
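To show where the cost comes from, here is a sketch of a linear lookup over a list of extended regions. How extended memory is actually addressed is not specified here; the example assumes each region has a base address and size inside the 16-bit space, and `ext_region_t` and `ext_lookup` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one block of extended memory. */
typedef struct ext_region {
    uint16_t base;           /* first VM address covered (assumed scheme) */
    uint16_t size;           /* number of bytes covered */
    uint8_t *data;           /* host storage backing the region */
    struct ext_region *next; /* next region in the list, or NULL */
} ext_region_t;

/* Walks the whole list on every access: O(number of regions) per operation.
   Returns a pointer to the byte, or NULL if no region covers addr. */
static uint8_t *ext_lookup(ext_region_t *head, uint16_t addr) {
    for (ext_region_t *r = head; r != NULL; r = r->next) {
        if (addr >= r->base && (uint32_t)(addr - r->base) < r->size) {
            return &r->data[addr - r->base];
        }
    }
    return NULL;
}
```

Primary RAM needs only a single array index per access, which is why copying data into general RAM is the faster route.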


Concurrency is currently a big pipe dream. Multiple VM instances should not interfere with each other, BUT a single VM having its memory or some such modified across threads will probably break horribly.

Register Considerations

Registers are exactly 16 bits wide. However, memory up to 64 bytes is reserved, which happens to fit 2 banks of registers (16 registers of 2 bytes each is 32 bytes per bank). This may be exploited in the future.
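The arithmetic behind the two banks can be written out as follows. The assumption that the reserved area starts at address 0 and that banks are laid out back to back is mine, and `reg_address` is a hypothetical helper.

```c
#include <stdint.h>

#define REG_COUNT      16u                       /* registers per bank */
#define REG_BYTES      2u                        /* each register is 16 bits */
#define BANK_BYTES     (REG_COUNT * REG_BYTES)   /* 32 bytes per bank */
#define RESERVED_BYTES 64u                       /* reserved low memory */

/* ASSUMPTION: bank 0 starts at address 0 and bank 1 follows immediately.
   Returns the memory address backing register idx of the given bank. */
static uint16_t reg_address(unsigned bank, unsigned idx) {
    return (uint16_t)(bank * BANK_BYTES + idx * REG_BYTES);
}
```

With these numbers the last register of the second bank ends at byte 63, so the reserved 64 bytes are used up exactly.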


It should have minimal libc requirements. Basically, you hand it a block of memory with the bytecode and it executes. There will also be a callback system for "breaking out" of the VM and into the actual machine to do things like print text.
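One way such an embedding interface might look, purely as a sketch: the callback system is described as planned, so every name here (`host_vm_t`, `host_callback_t`, `host_vm_init`, `host_vm_break_out`, `capture_char`) is hypothetical and not LightVM's API. The example callback collects characters into a caller-supplied buffer so that it needs no libc calls at all; a real host might call `putchar` instead.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct host_vm host_vm_t;

/* Invoked when bytecode asks to "break out" into the host. */
typedef void (*host_callback_t)(host_vm_t *vm, uint16_t arg);

struct host_vm {
    uint8_t *mem;             /* caller-owned block holding the bytecode */
    size_t mem_size;
    host_callback_t on_break; /* may be NULL if the host offers no services */
    void *user;               /* opaque pointer passed through to the callback */
};

/* Example host-side sink: a fixed buffer that receives "printed" characters. */
typedef struct {
    char *buf;
    size_t cap;
    size_t len;
} capture_t;

static void host_vm_init(host_vm_t *vm, uint8_t *mem, size_t mem_size,
                         host_callback_t on_break, void *user) {
    vm->mem = mem;
    vm->mem_size = mem_size;
    vm->on_break = on_break;
    vm->user = user;
}

/* What the interpreter would call on a "break out" instruction. */
static void host_vm_break_out(host_vm_t *vm, uint16_t arg) {
    if (vm->on_break != NULL) {
        vm->on_break(vm, arg);
    }
}

/* Example callback: treat the low byte of arg as a character to record. */
static void capture_char(host_vm_t *vm, uint16_t arg) {
    capture_t *c = (capture_t *)vm->user;
    if (c->len < c->cap) {
        c->buf[c->len++] = (char)(arg & 0xFFu);
    }
}
```

Keeping the memory block and the output sink owned by the caller means the VM side never has to allocate, which suits the small targets listed further down.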

A useful program should be easily implemented in 1-2K of memory.

It will assume it is running on a little-endian architecture
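For reference, this is what little-endian 16-bit access looks like when spelled out byte by byte. It is shown only to make the byte order concrete; LightVM as described assumes a little-endian host rather than using helpers like these, and `read16_le` and `write16_le` are hypothetical names.

```c
#include <stdint.h>

/* Little-endian: the low byte sits at the lower address. */
static uint16_t read16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void write16_le(uint8_t *p, uint16_t value) {
    p[0] = (uint8_t)(value & 0xFFu); /* low byte first */
    p[1] = (uint8_t)(value >> 8);    /* high byte second */
}
```

On a little-endian host these produce the same bytes as a direct 16-bit load or store, which is the case the VM relies on.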

No floating point or other special math support is planned at this point.

This VM is designed to be easily embedded and as portable as possible outside of the endianness issue.

Again, this is currently designed to target very small devices. Primary targets at this point:

  • Mbed ARM boards (32K to 8K of RAM)
  • Competent AVR microcontrollers (at least 1K of RAM?)
  • Standard x86 PCs (for testing)


Everything is under the BSD 3-clause license, which boils down to you can use it for anything you want, but you can't claim you wrote it.