Riak Setup Instructions

This document explains how to set up a Riak cluster.  It assumes that
you have already downloaded and successfully built Riak.  For help with
those steps, please refer to riak/README.


Riak has many knobs to tweak, affecting everything from distribution
to disk storage.  This document describes the common configuration
parameters, then walks through two typical setups: one small, one
large.


Configuration is stored in two simple text files, vm.args and
app.config.  Initial versions of these files are stored in the
rel/overlay/etc/ subdirectory of the riak source tree.  When a release
is generated, these "overlays" are copied to rel/riak/etc/.


The vm.args configuration file sets the parameters passed on the
command line to the Erlang VM.  Lines starting with a '#' are
comments, and are ignored.  The other lines are concatenated and
passed on the command line verbatim.

Two important parameters to configure here are "-name", the name to
give the Erlang node running Riak, and "-setcookie", the cookie that
all Riak nodes need to share in order to communicate.
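
For example, the relevant lines of a vm.args might look like this
sketch (the node name and cookie below are placeholders, not
defaults):

```text
## Name of the Erlang node running Riak
-name riak@127.0.0.1

## Cookie shared by all nodes in the cluster
-setcookie riak_cluster_cookie
```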


The app.config configuration file is formatted as an Erlang VM config
file.  The syntax is simply:

 {AppName, [
            {Option1, Value1},
            {Option2, Value2},
            ...
           ]}

Normally, this will look something like:

 {riak, [
         {storage_backend, riak_dets_backend},
         {riak_dets_backend_root, "data/dets"}
        ]},
 {sasl, [
         {sasl_error_logger, {file, "log/sasl-error.log"}}
        ]}

This would set the 'storage_backend' and 'riak_dets_backend_root'
options for the 'riak' application, and the 'sasl_error_logger' option
for the 'sasl' application.

The following parameters can be used in app.config to configure Riak
behavior.  Some of the terminology used below is explained elsewhere
in the Riak documentation.

cluster_name: string
  The name of the cluster.  Can be anything.  Used mainly in saving
  ring configuration.  All nodes should have the same cluster name.

gossip_interval: integer
  The period, in milliseconds, at which ring state gossiping will
  happen.  A good default is 60000 (sixty seconds).  Best not to
  change it unless you know what you're doing.

ring_creation_size: integer
  The number of partitions to divide the keyspace into.  This can be
  any number, but you probably don't want to go lower than 16, and
  production deployments will probably want something like 1024 or
  greater.  This is a very difficult parameter to change after your
  ring has been created, so choose a number that allows for growth, if
  you expect to add nodes to this cluster in the future.

ring_state_dir: string
  Directory in which the ring state should be stored.  Ring state is
  stored to allow an entire cluster to be restarted.

storage_backend: atom

  Name of the module that implements the storage for a vnode.  The
  four backends that ship with Riak are riak_fs_backend,
  riak_ets_backend, riak_dets_backend, and riak_osmos_backend. Some
  backends have their own set of configuration parameters.

    riak_fs_backend

    A backend that uses the filesystem directly to store data.  Data
    are stored in Erlang binary format in files in a directory
    structure on disk.

    riak_fs_backend_root: string
      The directory under which this backend will store its files.

    riak_ets_backend

    A backend that uses ETS to store its data.

    riak_dets_backend

    A backend that uses DETS to store its data.

    riak_dets_backend_root: string
      The directory under which this backend will store its files.

    riak_osmos_backend

    A backend that uses Osmos to store its data.

    riak_osmos_backend_root: string
      The directory under which this backend will store its files.

    riak_osmos_backend_block_size: integer
      The "block size" configuration parameter for Osmos.
      Defaults to 2048.
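
Putting these parameters together, the 'riak' section of an
app.config might look like the following sketch (all values here are
illustrative, not recommendations):

```erlang
 {riak, [
         %% all nodes in the cluster should share this name
         {cluster_name, "default"},
         %% gossip ring state every sixty seconds
         {gossip_interval, 60000},
         %% number of keyspace partitions; difficult to change later
         {ring_creation_size, 64},
         %% where ring state is saved between restarts
         {ring_state_dir, "data/ring"},
         %% vnode storage module, and its backend-specific root
         {storage_backend, riak_dets_backend},
         {riak_dets_backend_root, "data/dets"}
        ]}
```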

Single-node Configuration

If you're running a single Riak node, you likely don't need to change
any configuration at all.  After compiling and generating the release
("./rebar compile generate"), simply start Riak from the rel/
directory.  (Details about the "riak" control script can be found in
the README.)

Large (Production) Configuration

If you're running any sort of cluster that could be labeled
"production", "deployment", "scalable", "enterprise", or any other
word implying that the cluster will run continuously under ongoing
maintenance, then you will want to change the configuration a bit.
Some recommended changes:

* Uncomment the "-heart" line in vm.args.  This will cause the "heart"
  utility to monitor the Riak node, and restart it if it stops.

* Change the name of the Riak node in vm.args from riak@ to
  riak@VISIBLE.HOSTNAME.  This will allow Riak nodes on separate
  machines to communicate.

* Change 'riak_web_ip' in app.config if you'll be accessing that
  interface from a non-host-local client.

* Consider adding a 'ring_creation_size' entry to app.config, and
  setting it to a number higher than the default of 64.  More
  partitions will allow you to add more Riak nodes later, if you need
  to.

* Consider changing the 'storage_backend' entry in app.config.
  Depending on your use case, riak_dets_backend may not be your best
  choice.
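
Taken together, those app.config changes might look like this sketch
(the address, partition count, and backend below are placeholders for
your own choices):

```erlang
 {riak, [
         %% bind the web interface to an externally visible address
         {riak_web_ip, "10.0.0.1"},
         %% more partitions than the default of 64, to allow growth
         {ring_creation_size, 1024},
         %% an alternative backend, if dets does not fit your use case
         {storage_backend, riak_fs_backend},
         {riak_fs_backend_root, "data/fs"}
        ]}
```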

To get the cluster up and running, first start Riak on each node with
the usual "riak start" command.  Next, tell each node to join the
cluster with the riak-admin script:

box2$ bin/riak-admin join ExistingClusterNode
Sent join request to ExistingClusterNode

To check that all nodes have joined the cluster, attach a console to
any node, and request the ring from the ring manager, then check that
all nodes are represented:

$ bin/riak attach
Attaching to /tmp/erlang.pipe.1 (^D to exit)
1> {ok, R} = riak_ring_manager:get_my_ring().
2> riak_ring:all_members(R).

Your cluster should now be ready to accept requests.  See
riak/doc/basic-client.txt for simple instructions on connecting and
storing and fetching data, though you'll need to use an Erlang node
name for your client that isn't hosted on "".

Starting more nodes in production is just as easy:

1. Install Riak on another host, modifying hostnames in configuration
   files, if necessary.
2. Start the node with "riak start".
3. Add the node to the cluster with
   "riak-admin join ExistingClusterNode"

Developer Configuration

If you're hacking on Riak, and you need to run multiple nodes on a
single physical machine, use the "devrel" make command:

$ make devrel
mkdir dev
cp -R rel/overlay rel/reltool.config dev
./rebar compile && cd dev && ../rebar generate
==> mochiweb (compile)
==> webmachine (compile)
==> riak (compile)
==> dev (generate)
Generating target specification...
Constructing release...
cp -Rn dev/riak dev/dev1
cp -Rn dev/riak dev/dev2
cp -Rn dev/riak dev/dev3

This make target creates a release, and then modifies configuration
files such that each Riak node uses a different Erlang node name
(riak1-3), web port (8091-3), data directory (dev/dev1-3/data/), etc.

Start each developer node just as you would a regular node, but use
the 'riak' script in dev/devN/bin/.
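
For example, a session starting three dev nodes and joining them into
one cluster might look like this sketch (the node name passed to join
depends on the "-name" setting in each dev node's vm.args):

```text
$ dev/dev1/bin/riak start
$ dev/dev2/bin/riak start
$ dev/dev3/bin/riak start
$ dev/dev2/bin/riak-admin join dev1@127.0.0.1
$ dev/dev3/bin/riak-admin join dev1@127.0.0.1
```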


Viewing the activity log of a given riak node is done using the
riak-admin script.  Just pass it the command "logger", the Erlang node
name of the Riak node, and the cookie for that node:

$ riak-admin logger dev1@ riak
[13/Jan/2010:14:45:37 -0500]: {riak_connect,send_ring_to,'dev1@','dev2@'}
[13/Jan/2010:14:45:37 -0500]: {riak_ring_manager,write_ringfile,'dev1@',
[13/Jan/2010:14:45:37 -0500]: {riak_connect,changed_ring,'dev2@',gossip_changed}