Sat Jan 16 12:17:05 GMT 2010 Jon Harrop <firstname.lastname@example.org>
* Disabling TCO now uses a separate set of tests that do not require
TCO so they should all run.
* Applied bluestorm's patches for Makefile and camlp4 macros to make it
easier to write HLVM IR.
Sun Jan 10 16:29:16 GMT 2010 Jon Harrop <email@example.com>
* Removed the use of AddressOf in the hash table test.
* Added --no-stack-handler option so you can remove the libsigsegv
handler that is used to detect stack overflows.
* Convert some remaining uses of GetValue in the GC to use the abstract
* Fixed a bug where Alloc wasn't initializing mark state so it could
already be marked when the mark phase starts and its children could go
unmarked and get reclaimed despite being reachable.
Thu Jan 7 15:18:59 GMT 2010 Jon Harrop <firstname.lastname@example.org>
* Added a `Byte type to HLVM to make strings easier to use.
* Ported the ray tracer benchmark from OCaml to HLVM. HLVM is
significantly faster despite the use of the FFI to do sqrt.
* Made runtime.cpp 64-bit happy (int -> long).
Mon Jan 4 07:49:18 GMT 2010 Jon Harrop <email@example.com>
* Inlined get_stack.
* Changed the value "unit" to an undef rather than "int 0".
* Fixed unnecessary allocation of mark state for empty (NULL) arrays:
parallel n-queens is ~3% faster.
* Fixed the calculator example.
Fri Jan 1 20:57:09 GMT 2010 Jon Harrop <firstname.lastname@example.org>
* Optimized thread-local data to be passed by pointer in function
arguments, dramatically improving performance.
* Removed some concurrency bugs in the GC.
* Altered the parallel n-queens test to recursively spawn child
* Implemented ptr->size map in runtime.cpp to check for double frees
when runtime is in debugging mode.
Mon Nov 30 01:14:49 GMT 2009 Jon Harrop <email@example.com>
* First working version of threads and stop-the-world GC. Only took 5
days to implement!
* Global list of active threads and GC state (running or stopping the
* Thread local state keeps thread state (running or suspended) and
* First thread to hit allocation quota starts a GC by suspending all
other threads and performing the same mark+sweep algorithm, albeit
using all shadow stacks, before resuming all other threads. The other
threads spin while the GC is running.
* Currently, every function checks the global state on entry to see if
it needs to suspend itself. This is contention for the same lock from
every function call on every thread which is very wasteful and is the
main reason why HLVM is now 3x slower than it was. However, it is
easily remedied by removing checks from leaf functions (that make no
calls) and using a counter or even measuring elapsed real time to
check less often in non-leaf functions.
* Only non-trivial threading aspect is to enter and leave blocking
sections when users perform blocking operations like taking out locks
of their own. This ensures that one thread cannot start a GC cycle with
a lock taken out while another thread is waiting on that lock, which
would deadlock all threads.
Thu Nov 12 18:08:41 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Time spent in mark and sweep phases is accumulated per expression
* Benchmarks are now warmed up to remove JIT time.
* New bubblesort benchmark.
* More fine-grained and intelligent loop unrolling.
Sun Oct 25 22:40:03 GMT 2009 Jon Harrop <email@example.com>
* Updated to work with LLVM 2.6: now uses llcontext and initializes
targets before creating a JIT.
* Replaced the garbage collector with a more conventional mark-sweep
collector that stores mark state in the heap.
Sat Sep 26 22:51:59 BST 2009 Jon Harrop <firstname.lastname@example.org>
* Array allocation now accepts the initial element value.
* The second iteration of the compiler is now in SVN.
Thu Jun 25 02:36:38 BST 2009 Jon Harrop <email@example.com>
* Improved parsing in the example compiler to handle line numbers and
* Improved pattern matching in the example compiler to allow tuples to
be deconstructed in "let .. in" bindings as well as function
Sun Jun 21 20:04:48 BST 2009 Jon Harrop <firstname.lastname@example.org>
* Added two examples: a trivial calculator and a rudimentary compiler
with support for recursive function definitions with explicit types and
unit, char, int, float and tuple types. Arrays will require
* Note: closures that do not require cycles can be represented entirely
Sat Mar 21 17:42:31 GMT 2009 Jon Harrop <email@example.com>
* Memoized constant string literals.
* Fixed perf bug where all bound variables were being pushed onto the
shadow stack at the beginning of every function body instead of just
the function's arguments.
* Optimized array visit function to skip arrays that do not contain
any reference types.
* Implemented standalone compilation. All JIT executed functions are
called from a "main" function in the output aout.bc which the ./opt.sh
script compiles through LLVM's optimization passes into aoutopt. This
gives us some idea of how much performance would be gained by applying
LLVM's optimization passes if we had bindings for them.
* Moved timing code from the compiler into the generated code so our
standalone executable can be used for benchmarking.
Fri Mar 13 02:42:33 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Switched from LLVM's malloc intrinsic to helper functions for alloc
(using calloc in order to initialize references to NULL) and free.
* Optimized the GC to use a hash table instead of linear search,
improving performance by up to 100x on our benchmarks.
* Avoided resetting the shadow stack at the ends of functions that do
not push any roots onto it.
Wed Mar 11 21:05:44 GMT 2009 Jon Harrop <email@example.com>
* Chose to put new LLVM bindings into the HLVM project instead of
patching LLVM. We now have bindings for BuildExtractValue,
BuildInsertValue and for enabling tail call optimization globally.
Wed Mar 11 04:01:45 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Tidied the run-time type code to reuse HLVM's own structs.
* Fixed a perf bug where reading a reference type from a bound
variable was pushing it onto the shadow stack. Sieve is now 70% faster.
* Fixed a bug in init_type which was invoking a fastcc function when it
should have been using the C calling convention.
* Added dummy arguments to all argumentless invocations to work around
a bug in LLVM's OCaml bindings, which calls malloc for zero bytes
(which is illegal) if there are zero arguments.
Mon Mar 9 12:31:16 GMT 2009 Jon Harrop <email@example.com>
* Fixed the GC to use an explicit "visit stack" instead of the system
stack. This eliminates the stack overflow bug that prevented the list
test from running to completion but the current GC is so slow that
I still have not run this test to completion.
Sun Mar 8 04:23:30 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Prepared for first public release.
Fri Mar 6 01:56:13 GMT 2009 Jon Harrop <email@example.com>
* Made the "expr" and "def" functions mutually recursive so the
compilation of an expression can trigger the compilation of an
auxiliary function that it depends upon.
* Implemented a simple stop-the-world mark and sweep garbage collector
written in the HLVM language itself.
* A new `UnsafeFunction top-level statement defines a function without
injecting any code for the GC.
* Ordinary functions call gc_root to register a local root into the
heap (e.g. when loading a reference type) by pushing it onto the shadow
stack, call gc_alloc when a value is allocated to push it onto the
allocated list and gc_restore when a function returns to unwind the
Tue Mar 3 02:38:35 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Realised that the current functionality can already be used to
represent closures as a function pointer and a reference type
containing the environment. Added some tests that use this.
Wed Feb 25 21:20:26 GMT 2009 Jon Harrop <email@example.com>
* Made ffib benchmark 50% faster by swapping the branches of the "if"
expression: most common branch should go first.
* Added code to Printf to handle 32-bit floats by extending them to
64-bit floats for printing.
Wed Feb 25 03:22:42 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Added OCaml bindings to the new ExtractElement and InsertElement
instructions now in LLVM's C bindings. This has improved performance
a lot: Mandelbrot3 (with complex arithmetic) is now 3.5x faster and
HLVM is faster than OCaml on every test.
Sun Feb 22 16:14:10 GMT 2009 Jon Harrop <email@example.com>
* First implementation of reference types represented as a struct
containing a pointer to the type and a pointer to the argument (if
any). This allows typed NULL references which should permit efficient
implementation of properly-typed linked lists.
* Returning first-class structs inhibits TCO so we must sret all struct
return types, which I have now done.
* Removed a layer of indirection in the representation of arrays, which
is now a struct containing the type, length and pointer to heap data.
* Fixed with respect to all existing tests. Array performance has
degraded slightly due to the current overhead of structs.
Thu Feb 19 23:23:25 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* build_allocas must go in the entry block of a function. Altered
state#alloca to insert at the end of the entry block and tweaked the
compilation of "if" expressions to insert the final jump from the
previous block only after compilation of both branches is complete, so
they insert their allocas before the branch.
* Lots of optimization passes are not yet available from OCaml but
they could do a great deal with our code, particularly our inefficient
* Reference type is a struct with a pointer to the type and a pointer
to the argument. That allows NULL to be typed.
Fri Feb 6 01:39:13 GMT 2009 Jon Harrop <email@example.com>
* Added structs but they are locally allocated so they cannot be
passed in the arguments to a tail call.
* Used an ingenious workaround: copy every struct into an alloca in
order to use gep until extractvalue is implemented.
* Fixed a bug caused by marking every call as a tail call when this
can confuse LLVM's (CSE) optimization pass.
Tue Jan 27 00:47:10 GMT 2009 Jon Harrop <firstname.lastname@example.org>
* Extended sieve and mandelbrot examples to illustrate segmentation
fault due to stack overflow due to non-tail calls.
* Introduced a Return constructor into the Expr.t type to handle tail
* Added first-class function pointers by casting all defined functions
into pointer types before binding them to their variable name.
Sun Jan 25 06:51:00 GMT 2009 Jon Harrop <email@example.com>
* Clean design with "expr" checking all types including return types.
* FastCC but no propagation of tail calls into final expressions.
* No extern functions.
* No first-class function pointers.
* No run-time types.
* No shadow stack.
* No tuples or multiple return values: functions accept multiple
arguments but return only one result.