Didactronic Toolkit

Author: Marc Perron
Date: 2016-02-27 11:43:11 EST

This package contains the source required to build the didactronic
toolkit C library, which provides a framework for developing systems
that use reinforcement learning techniques. The library and its
associated headers are installed onto the target system, making them
available to developers, who can apply the library's algorithms by
specifying the parameters of the task to be solved.

Table of Contents
1 Installation Instructions
    1.1 Build Dependencies
    1.2 What's Built
    1.3 What's Installed
    1.4 What's Not Installed
2 Why Didactronic?

1 Installation Instructions 
  The Didactronic Toolkit utilizes the autotools[1] suite to
  facilitate a portable build process. The following procedure should
  be sufficient on most architectures:

  1. ./configure
  2. make
  3. make install

  If this does not work for your particular configuration, consult the
  configure help:

  ./configure --help

1.1 Build Dependencies 
   The didactronic toolkit uses strlcpy and strlcat, which are
   standard functions only on BSD systems. On Linux hosts, libbsd is
   required for the library to link. To install this library on
   Debian and its derivatives (e.g. Ubuntu, Mint), use the following
   command:

  sudo apt-get install libbsd-dev

1.2 What's Built 
   The make step will build the libdidactronic and libcontainers
   libraries in the flavour most appropriate for the target platform
   (.so by default on most Unix systems). Additionally, example
   programs will be built which illustrate how to use the didactronic
   toolkit:

   - gamblers-problem: A test program demonstrating value iteration of
     dynamic programming. The example used is the Gambler's Problem
     which is described as follows:

     A gambler has the opportunity to make bets on the outcomes of a
     sequence of coin flips. If the coin comes up heads, he wins as
     many dollars as he has staked on that flip; if it is tails, he
     loses his stake. The game ends when the gambler wins by reaching
     his goal of $G, or loses by running out of money. On each flip,
     the gambler must decide what portion of his capital to stake, in
     integer numbers of dollars. This problem can be formulated as an
     undiscounted, episodic, finite MDP. The state is the gambler's
     capital, s in {1, 2,..., 99} and the actions are stakes, a in {1,
     2,..., min(s,G-s)}. The reward is zero on all transitions, except
     those on which the gambler reaches his goal, when it is +1. A
     state-value function then gives the probability of winning from
     each state. A policy is a mapping from levels of capital to
     stakes. The optimal policy maximizes the probability of reaching
     the goal.

   - tictactoe: This program uses policy iteration to generate the
     moves in a game of Tic-Tac-Toe. Policy iteration occurs after
     each game, updating the policy based on the actions taken during
     that game.
1.3 What's Installed 
   The "make install" step will install the libdidactronic and
   libcontainers libraries into the standard locations (unless they
   are overridden by a configure option). The public header files which
   are needed to build programs against the toolkit are also installed
   on the target.

1.4 What's Not Installed 
   The example programs will not be installed on the target
   system. These are only examples to validate the toolkit and are not
   meant to be deployable apps.

2 Why Didactronic? 
  The name Didactronic was chosen for this package for a few reasons:

  1. Merriam-Webster defines didactic as "designed or intended to
     teach people something"[2]. Since the toolkit is designed to
     support the development of reinforcement learning agents, this
     name seemed appropriate.

  2. The suffix -tronic, which is of Greek origin, refers to a
     device, a tool, or an instrument[3]. The library associated
     with the toolkit can be thought of as an instrument in support of
     reinforcement learning agents.

  3. The toolkit is provided to support developers in designing agents
     which make use of reinforcement learning to solve a given
     task. The packaging of the run-time library and programming API
     makes it a toolkit.

[1] []

[2] []

[3] []