(* DO NOT EDIT (digest: 377865a37801d208903e6edbddfec8ad) *)
This is the README file for the mtsim distribution.

Simulator of multithreaded algorithms to prove their correctness by
bruteforce over preemption points

See the files INSTALL.txt for building and installation instructions. 


    Why not Spin < >?

  Author of mtsim needed a flexible tool that can be tweaked to his needs.
He have looked at Spin sources first (C, large enough code base)...
  Now mtsim core consists of about 1100 lines of OCaml code in sparse style.

    General concepts.

  Writing correct multithreaded code with shared mutable state is HARD.
Avoid it if you can.  There are better alternatives for most of cases,
such as message passing.  However, if you MUST use threads, semaphores
and so on, mtsim library can help you.  It can check your algorithms
either exhaustively (brute-forcing over preemption points) or in a random

  Algorithms are represented as lists of instructions (to load variables
from shared memory to local thread stack and store them back, to wait/signal
semaphore and so on).  Instructions are executed in order of presence in list.
Control structures "if" and "while" can be used for implementing non-linear
control flow.

  Algorithms are lists, so usual list functions are useful for writing
algorithms in DSL, especially useful is "@" (List.append), it's like a ";"
separator seen in imperative code.

  Modifying variable (when new value depends on old value) is a
two-instructions process: 1. the variable is loaded from shared memory into
thread local stack, 2. stack top is modified and written back to shared memory.

  It's possible to check some conditions on each simulation step and to check
conditions on successful termination of all simulated threads.

  It's usually impossible/inconvenient to write all program logic in DSL
to simulate it, so it's recommended to write just a skeleton of program,
responsible for multithreaded part.  Skipped parts can be represented by
named "sections", and simulator can check that some sections can't ever
run simultaneously.  (For example, in readers-writer lock we can check
that there can be only one writing thread, and readers and writer can't run
at the same time both).

  The situation when all live threads wait on semaphores is called "deadlock",
such situations are detected by this library too.

    How to use it?

  Algorithms are written in OCaml using DSL, then packed in "task"
(for example, task can be "check that one reader and one writer never enter
reader and writer section at the same time", or "after execution of two
algorithms the value of variable x is 2").

  Then there are two options: 1. run exhaustive check (all paths of execution
are checked), 2. run random check (with optionally specified number of
simulations to run).

  In case of error its description is written to stderr: type of error
(deadlock, condition violation) and trace (which instructions from which
threads were executed).  However it's not convenient to analyze plain text,
so there exists optional argument ~to_html:"filename.html", it generates
html file with trace (table of instructions and shared memory state on each
step) and javascript ui to highlight trace steps that modified some
variable/semaphore/section and navigate to previous modification of
variable / wait/signal of semaphore / entering/exitting section.

  DSL/API is not a very long reading -- take a look at file "src/mtsim.mli".

    Toy examples.

  First, let's check that modifying shared variable without locking it
with mutex is a bad idea.

    let badmutator_task =
      (* we will mutate "badvar" variable, it's initial value is 0 *)
      let badvar = var "badvar" 0 in
      (* here is the mutator program: it consists of single "modify_var",
         which generates two instructions:
         1. load "badvar" value from shared memory to local thread stack,
         2. apply OCaml function "succ" to loaded value and write the result
            back to shared memory, where "badvar" is placed.
      let badmut = modify_var badvar succ in
      (* we will run these programs in parallel: *)
      let programs = [ badmut ; badmut ] in
      (* and the task description is: *)
      { programs = programs
      ; mids = [| |]
      ; afts = [| Var_eq (badvar, List.length programs) |]
          (* here we check that after execution of all threads
             variable "badvar" should have value 2 (since there are
             two programs to simulate *)
      ; task_name = "bad mutator"

  Then we run

    let () = check_task_exh badmutator_task ~to_html:"test/badmut.html"

  and get html file with trace.  It's available in repository of this project
as "test/badmut.html".  This file shows that the following execution path can
happen: both threads load badvar value (0) into their stacks, then each
thread adds 1 to it (thus getting 1) and stores it to shared memory.  Both
threads write 1 into badvar, while naive programmer could expect that variable
should always have value 2 after execution of these algorithms in parallel.

  Second example shows that locking mutexes in improper order can lead
to deadlock.  The task description is:

    let deadlock2 =
      let m1 = sem "m1" 1
      and m2 = sem "m2" 1
      { programs =
          [ ( wait m1 @ wait m2 @ signal m2 @ signal m1 )
          ; ( wait m2 @ wait m1 @ signal m1 @ signal m2 )
      ; afts = [| |]
      ; mids = [| |]
      ; task_name = "deadlock on 2 mutexes"

  (of course you know that semaphores can be used to emulate mutexes:
"wait" = "lock", "signal" = "unlock"; mtsim knows it too and provides
synonyms "lock" and "unlock" for "wait" and "signal".  But there are
no checks that semaphore used with lock and unlock can have values
0 or 1 only.)

  Then we run task:

    let () = check_task_exh deadlock2 ~to_html:"test/deadlock2.html"

  and mtsim generates file (available in this repository,
"test/deadlock2.html").  This file shows that it's possible to have such
threads preemptions so that first thread locks mutex m1, waits for mutex m2,
which is already locked by second thread and can't be unlocked while m1 is

  More examples can be found in "test/".


  It would be interesting to compile programs (which are lists for now) to
array of code, array of mutable data, and represent threads as "instruction
pointer + local stack".  This approach can potentially speed-up brute-force
(cache locality mainly, as intuition tells me).  Anybody wants to contribute?


- Dmitry Grebeniuk <> - idea, coding
- Stanislav Gritsyuk <> - javascript ui for html traces