
README

This repository contains a collection of scripts to perform experiments with the Shore-MT storage manager and the benchmarks implemented by the Shore-Kits package. It consists of the following components:

  • setup.sh Performs an initial one-time setup of environment parameters, e.g., paths to DB and log devices
  • run.sh Performs a single benchmark run using the Shore-Kits package
  • repeat.sh Performs multiple benchmark runs varying one single parameter across any range of values
  • extract.sh Gathers experiment data from traces and logs generated by the repeat script
  • genplots.sh Generates plots for the extracted data, including customized gnuplot templates used in publications
  • loginspect A C++ program that inspects the log files left behind by a benchmark to extract relevant data and traces
  • gnuplot/* A collection of (customized) gnuplot files used in various publications

The following sections document each component in detail.

Environment setup

The script setup.sh is used to collect basic system parameters that will be used throughout the experiments. It supports two basic use cases:

  1. File-based I/O In this setup, both the log and the DB will be stored in filesystem paths to which the invoking user has write permission. This is the standard scenario if I/O performance is not a critical aspect of the experiment. It can also be used to spare I/O delays and write only to a ramdisk (i.e., /dev/shm/shore).

  2. Raw device I/O This setup is recommended if the experiment involves I/O-related measurements. It allows using raw device paths (e.g., /dev/sda or /dev/sda1) for either the log or the DB. The script must then be invoked as root. The following operations are performed for each raw device:

    1. Perform extensive checks to see if any data could be lost on the raw devices (e.g., if they contain a filesystem)
    2. Allow the invoking user to execute sbin programs (e.g., hdparm) using the /etc/sudoers file (requires sudo)
    3. [DB only] Grant the invoking user read/write permission on the raw device using udev
    4. [Log only] Add an exclusive filesystem entry to /etc/fstab to be used by the log, with permissions to the invoking user

The setup runs as an interactive prompt. Other than the path to the devices, the script asks for the path to the shore_kits executable and, in case of raw devices, the username of the user that will be invoking the experiments. Default values are given for each question.

The motivation behind the setup script is that benchmarks can then be run on raw devices as a normal (non-root) user. Requiring the script to be executed before any benchmark also provides extra safety, since it performs extensive checks to avoid erasing data when a wrong path is entered (which has already happened to me).

Note: if the setup script was previously run with a different user, the udev setup will fail and the script will not finish successfully. To set up the same DB device with a new user, just delete the file /etc/udev/rules.d/80-shoremt.rules. Also remember to check the permissions and to unmount the log filesystem if it was previously mounted with another user. Another option is to simply delete the filesystem -- the setup script will then reformat it.

Once the setup is finished, a file setup.properties is generated containing variable assignments that will be imported by other scripts. For additional security, a file .SETUP_DONE is also generated so that other scripts can make sure that setup was explicitly invoked by the user.
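A script that depends on the setup could guard itself roughly as sketched below. This is not the repository's actual code: the scratch directory, the function name check_setup, and the variable LOG_DEVICE written into the sample file are assumptions made for illustration. Only the file names setup.properties and .SETUP_DONE come from the description above.

```shell
#!/bin/sh
# Sketch only: emulate the marker check in a scratch directory.
workdir=$(mktemp -d)

# Pretend setup.sh already ran (LOG_DEVICE is a hypothetical variable name).
echo 'LOG_DEVICE=/dev/shm/shore/log' > "$workdir/setup.properties"
touch "$workdir/.SETUP_DONE"

check_setup() {
    # Refuse to continue unless the marker file exists.
    if [ ! -f "$1/.SETUP_DONE" ]; then
        echo "setup has not been run in $1" >&2
        return 1
    fi
    # Import the variable assignments produced by the setup.
    . "$1/setup.properties"
}

if check_setup "$workdir"; then
    setup_ok=yes
else
    setup_ok=no
fi
echo "setup_ok=$setup_ok LOG_DEVICE=$LOG_DEVICE"
```

Keeping the marker separate from the properties file means that a hand-written setup.properties alone would not pass the check.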

Running a benchmark

The script run.sh is used to perform a single benchmark run. The benchmark parameters can be set in two ways:

  1. Using variable assignments (properties files). The script reads benchmark parameters from bash variables with predefined names. See the file parseopts.sh for the default values. To run a benchmark, a properties file is written manually and given to run.sh as argument. The file is simply imported by the script, which means it can also contain arbitrary bash code. This allows for great flexibility while still being very simple. Check the folder suites for some examples.
  2. [Deprecated] Using command-line arguments. Type run.sh --help or check the parseopts.sh file for details of which parameters are supported. As the number of parameters kept growing, it became cumbersome to maintain this alternative, so the first one should be used instead.

The primary principle of the run script is that every possible parameter is specified exclusively with a bash variable. Parameters are passed to Shore-Kits via a shore.conf file generated from the template shore.conf.in. This principle allows better control of experiments. For example, we can guarantee that two runs had the exact same configuration if and only if they used the same set of parameters. To facilitate this kind of verification, the script dumps the parameter values in a config.properties file in the run directory. Furthermore, this principle facilitates the design of the repeat script (see below).
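One way to check whether two runs used the same configuration is to compare their config.properties dumps. The sketch below uses two scratch directories with made-up contents in place of real run directories, and it assumes that the dump lists parameters in the same order on every run, which we have not verified.

```shell
#!/bin/sh
# Sketch only: two scratch run directories stand in for real runs.
tmp=$(mktemp -d)
mkdir -p "$tmp/runA" "$tmp/runB"
printf 'BENCHTYPE=tpcc\nNUM_THREADS=4\n' > "$tmp/runA/config.properties"
printf 'BENCHTYPE=tpcc\nNUM_THREADS=8\n' > "$tmp/runB/config.properties"

# cmp -s is silent and returns 0 only if the files are byte-identical.
if cmp -s "$tmp/runA/config.properties" "$tmp/runB/config.properties"; then
    same_config=yes
else
    same_config=no
    # Show which parameters differ.
    diff "$tmp/runA/config.properties" "$tmp/runB/config.properties" || true
fi
echo "same_config=$same_config"
```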

In the following, we give a brief overview of the most important parameters.

Run directory

Any kind of log, trace, or dump produced by a benchmark run is saved in a special run directory, which can be specified in the variable BASEDIR. It defaults to /dev/shm/shore. In I/O-intensive experiments, we recommend setting this to a ramdisk path, to minimize interference on benchmark I/O.

Benchmark setup (Shore-Kits)

The benchmark type is specified by the BENCHTYPE variable. Possible values are: tpcb, tpcc, tpch, tpce, and tm1. The scaling factor is given in the QUERIED_SF variable. Unlike Shore-Kits, we assume that the loaded and the queried SF are always the same.

Further benchmark parameters are basically those passed to Kits' measure and test commands:

  • FIXED_TRX_COUNT If set to 1, run a fixed number of transactions (i.e., test command) instead of a fixed time duration (i.e., measure).
  • EXP_NUM_TRXS Number of transactions to run if running test
  • EXP_DURATION Benchmark duration in seconds if running measure
  • SPREAD_TRXS Activates Kits' spread option (values 0 or 1), which assigns threads to transactions based on some logical partitioning criterion (e.g., warehouses in TPC-C).
  • NUM_THREADS Number of worker threads to spawn for benchmark
  • NUM_LOADERS Number of threads to use during loading phase
  • TRX_ID Benchmark-specific identifier for which transaction mix to run (default is 0 for all transactions)
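As an illustration, a properties file for a timed TPC-C run might look like the fragment below. The variable names are the ones documented above. The values are arbitrary and the file name my-tpcc.sh is hypothetical.

```shell
# my-tpcc.sh (hypothetical file name): properties for a timed TPC-C run
BENCHTYPE=tpcc             # one of tpcb, tpcc, tpch, tpce, tm1
QUERIED_SF=10              # scaling factor, used for both loading and querying
FIXED_TRX_COUNT=0          # 0: run for a fixed duration (measure)
EXP_DURATION=60            # seconds
SPREAD_TRXS=1              # use the spread option
NUM_THREADS=4
NUM_LOADERS=$NUM_THREADS   # the file is sourced, so it may contain bash expressions
TRX_ID=0                   # 0 selects all transactions
```

Such a file would then be given to run.sh as its argument.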

Benchmark phases

The script supports three basic phases when running a benchmark: load, run, and recover. The first two always occur, while the third is optional. The behavior is controlled by the following parameters:

  • LOAD_AND_RUN If set to 1, the load and run phases will be executed in a single Shore-Kits invocation. Otherwise, one initial invocation loads the DB and a second one simply runs the benchmark (uses Kits option clobberdev). The latter is preferable if we want to collect data for the load and run phases separately.
  • WARMUP_BUFFER If set to 1, the Kits command db_fetch will be invoked before the benchmark. This option is only valid if the load and run phases are executed separately.
  • CRASH_AFTER_EXP If set to 1, the kits command crash will be executed when the benchmark is finished and a recovery phase will be executed afterwards. Note that only REDO operations will be executed, because all transactions are finished at the time of the crash.
  • KILL_AFTER If set to a number greater than zero, the Shore-Kits process will abort after the given number of seconds. If recovery is enabled, we would then observe both UNDO and REDO. This can be used to simulate a real crash by software means (AFAIK, as good as it gets for automated experiments).
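The fragment below sketches how these parameters could be combined in a properties file for a recovery experiment. The values are illustrative, the variable phase_config and the sanity check are our own additions, and we assume that KILL_AFTER=0 leaves the timed abort disabled (the description above only covers values greater than zero).

```shell
# Fragment of a properties file (illustrative values)
LOAD_AND_RUN=0       # load and run in separate Shore-Kits invocations
WARMUP_BUFFER=1      # only valid when the phases are separate
CRASH_AFTER_EXP=1    # crash at the end, then run the recovery phase
KILL_AFTER=0         # assumed to leave the timed abort disabled

# Properties files are sourced, so they may also carry sanity checks.
phase_config=ok      # hypothetical helper variable
if [ "$WARMUP_BUFFER" = 1 ] && [ "$LOAD_AND_RUN" = 1 ]; then
    echo "WARMUP_BUFFER requires separate load and run phases" >&2
    phase_config=invalid
fi
```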

Storage Manager options

The following options control the properties passed to the SM instance. They are placed in the generated shore.conf file.

  • LOG_SIZE Size of the active recovery log (i.e., the amount of log managed by partition_t objects)
  • LOG_BUFFER_SIZE Size of the log buffer used for flushing. [Note: when scanning the log, including in transaction rollbacks, Shore always goes directly to the partition files, so it seems that the buffer is really only used as a write buffer for log flushes].
  • BUFFER_RATIO Size of the buffer pool as a fraction of the loaded DB size (1.0 corresponds to the whole DB). The DB size is computed using the fixed and expected per-scaling-factor sizes (e.g., 130MB per warehouse in TPC-C), which means it is only a rough estimate. Also note that the DB usually grows during benchmarks, so set this to something higher than 1.0 to guarantee the absence of buffer replacements.
  • CHKPT_FREQUENCY Checkpoint frequency in seconds
  • CHKPT_INTERVAL Interval at which older pages will be flushed to disk at each checkpoint (i.e., second-chance checkpoints). Set to -1 to never flush (traditional fuzzy checkpoint used by Shore) and to 0 for flushing all dirty pages at each checkpoint.
  • LOG_FLUSHES If set to 1, page flushes invoked by the page cleaner will be logged to allow faster recovery.
  • PAGE_CLEANER If set to 1, a page cleaner thread will be created when mounting a device (shore option backgroundflush). Note that the page cleaner behavior may further be controlled by the cleaner policy (currently under development [TODO])
  • O_DIRECT If set to yes, direct I/O for the database device will be used.
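To get a feeling for BUFFER_RATIO, the following sketch estimates the buffer pool size for TPC-C from the 130MB-per-warehouse figure quoted above. It is a back-of-the-envelope calculation under that single assumption. The actual script may add a fixed base size or round differently, and the names MB_PER_SF and bufpool_mb are hypothetical.

```shell
#!/bin/sh
# Sketch only: the real computation in the scripts may differ.
QUERIED_SF=10          # TPC-C warehouses
BUFFER_RATIO=1.5       # 1.0 would equal the estimated DB size
MB_PER_SF=130          # per-warehouse estimate quoted above for TPC-C

# Shell arithmetic is integer-only, so use awk for the fractional ratio.
bufpool_mb=$(awk -v sf="$QUERIED_SF" -v r="$BUFFER_RATIO" -v mb="$MB_PER_SF" \
    'BEGIN { printf "%.0f", sf * mb * r }')
echo "estimated DB size: $((QUERIED_SF * MB_PER_SF)) MB, buffer pool: ${bufpool_mb} MB"
```

With 10 warehouses the estimated DB size is 1300 MB, so a ratio of 1.5 leaves about 650 MB of headroom for growth during the run.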

Debug and trace options

  • TRACE If set to 1, Shore-Kits traces will be produced as well as any Shore-MT trace produced by the DBG macros.
  • DEBUG_FILES List of Shore-MT source files (separated by whitespace) for which debug traces will be produced (requires Shore to have been compiled with --enable-trace)
  • RUN_GDB If set to 1, Shore-Kits will be invoked inside a GDB session
  • MONITOR_HARDWARE If set to 1, traces will be collected for I/O and CPU utilization (using iostat and mpstat). These traces can be processed by the extract script (see below)

Logs and traces

For each benchmark phase, files *.stdout.txt and *.stderr.txt will be written to BASEDIR containing the Shore-Kits output. These files can be used to collect experiment data, since the Kits command sm_stats is invoked after each phase. Other files generated include the output of the hardware monitors, of the log inspector (see below), debug traces, and a dump of the configuration parameters in the file config.properties.

Iterating parameters and repeating

Sitting one level above the run script, the script repeat.sh can be used to perform multiple benchmark runs using controlled parameters. In essence, what one would call an experiment could be mapped to a single invocation of the repeat script.

The script takes as main argument one benchmark configuration file, i.e., a sequence of bash variable assignments as used in run.sh -- this will serve as the base benchmark configuration. One variable, however, whose name is given in the --var argument, will assume a different value for each benchmark invocation. Such values are taken from a list given in the --range argument. Running one benchmark for each value in the range is what we call an iteration. In order to run multiple iterations, the argument --iterations can be used. The command repeat.sh --help can be issued for a description of the script usage.

The script also takes two additional parameters related to the database log. To save the log files of each benchmark run, the argument --save-log must be provided. Note that the compilation flag KEEP_LOG_PARTITIONS can be used in Shore-MT to disable the deletion of log files and keep the complete history of a benchmark run. Just be aware that the log volume grows very fast into the gigabyte range. A second parameter that can be supplied is --inspect-log, which will execute the loginspect program on the log after each benchmark run and save its output. The program is described further below in more detail.

Example

To understand the parameters in more detail, consider the following example:

repeat.sh \
  --iterations 2 \
  --var CHKPT_FREQ \
  --range "2 5 10" \
  --inspect-log \
  --outdir /home/csauer/runtime/shore/experiment \
  defaults.sh

This gives the following output:

Iterations = 2
Var name = CHKPT_FREQ
Range = 2 5 10
==== Iteration 1 of 2 ====
Running with CHKPT_FREQ = 2
Running with CHKPT_FREQ = 5
Running with CHKPT_FREQ = 10
Saving experiment results to /home/csauer/runtime/shore/experiment/rep1
==== Iteration 2 of 2 ====
Running with CHKPT_FREQ = 2
Running with CHKPT_FREQ = 5
Running with CHKPT_FREQ = 10
Saving experiment results to /home/csauer/runtime/shore/experiment/rep2

This experiment will perform 3 benchmark runs with the checkpoint frequency parameter set to 2, 5, and 10, respectively. Since 2 iterations were requested, there will be a total of 6 runs, so that we have two sample values for each checkpoint frequency. The benchmark parameters will be read from the file defaults.sh, except for CHKPT_FREQ, which will be overridden by the repeat script. The flag --inspect-log will trigger the loginspect program (see below) after each benchmark run. All traces and logs generated, including the output of the log inspector, and the log files if --save-log is specified, will be stored in the output directory, in this case /home/csauer/runtime/shore/experiment. Below we give a brief description of how files are organized in this directory.
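The iteration logic of this example can be pictured with the loop below. It is a simplified sketch and not the code of repeat.sh: no benchmark is started, and the initial value of CHKPT_FREQ merely stands in for whatever defaults.sh would provide.

```shell
#!/bin/sh
# Sketch of the iteration logic only; no benchmark is actually started.
ITERATIONS=2
VAR=CHKPT_FREQ
RANGE="2 5 10"
CHKPT_FREQ=60   # stands in for the value that defaults.sh would provide

runs=0
i=1
while [ "$i" -le "$ITERATIONS" ]; do
    echo "==== Iteration $i of $ITERATIONS ===="
    for value in $RANGE; do
        # Override the chosen variable; all other parameters stay as configured.
        eval "$VAR=\$value"
        echo "Running with $VAR = $(eval "echo \$$VAR")"
        runs=$((runs + 1))
    done
    i=$((i + 1))
done
echo "total runs: $runs"
```

Because every parameter is a plain shell variable, overriding one of them by name is all that is needed to vary it across runs.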

Output

The output directory of the example above will have the following layout after execution of the script:

  • experiment
    • rep1
      • trace_2
        • load.kits.stderr.txt
        • load.kits.stdout.txt
        • run.kits.stderr.txt
        • ...
      • trace_5
      • trace_10
    • rep2
      • ...

For each iteration, a folder repN will be created. If the same experiment (i.e., the same output directory) was also used in an earlier invocation, the folders will be created using the lowest non-existing number. This allows the same experiment to be performed over different invocations, e.g., every night for 5 days. Inside each rep folder, one trace_* folder is created for each value assumed by the experiment variable, in this case the checkpoint frequencies 2, 5, and 10. Finally, all files output by the benchmark run are saved in the trace folder. This directory structure will be exploited by the other scripts to collect experiment data. From now on, we shall refer to the output directory of the repeat script as the experiment folder.
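Picking the lowest non-existing rep number can be sketched as follows. The scratch directory stands in for an experiment folder that already holds two iterations. The loop illustrates the rule described above and is not taken from the script.

```shell
#!/bin/sh
# Sketch only: a scratch directory stands in for the experiment folder.
outdir=$(mktemp -d)
mkdir "$outdir/rep1" "$outdir/rep2"   # results of an earlier invocation

# Find the lowest N for which repN does not exist yet.
n=1
while [ -d "$outdir/rep$n" ]; do
    n=$((n + 1))
done
mkdir "$outdir/rep$n"
echo "next iteration goes to rep$n"
```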

Handling failed runs

If a benchmark run fails, i.e., if the shore_kits binary returns a non-zero status code, then a trace folder will not be created in the experiment folder. This means that failed runs will leave "holes" in the experiment, which may cause the extraction phase to fail. This design choice was made because we believe it is better to omit the results, causing extraction to fail, than to generate invalid data that can be misinterpreted when analyzing the results (e.g., averages that include many zero-values). To cope with this situation, the repeat script offers the argument --run-missing. When set, it causes the script to go through the available rep folders and run the benchmark once again for each missing trace folder. With --run-missing activated, one single failed run causes the whole repeat script to exit with failure. This means that this flag can be used to verify whether all repetitions completed successfully after an experiment is finished.
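Holes of this kind can be located by checking each rep folder for the expected trace folders. The sketch below builds a small experiment folder with one missing run and lists it. The variable names are hypothetical and the loop only illustrates the idea behind --run-missing, not its implementation.

```shell
#!/bin/sh
# Sketch only: build a small experiment folder with one failed run.
expdir=$(mktemp -d)
RANGE="2 5 10"
mkdir -p "$expdir/rep1/trace_2" "$expdir/rep1/trace_5" "$expdir/rep1/trace_10"
mkdir -p "$expdir/rep2/trace_2" "$expdir/rep2/trace_10"   # trace_5 is missing

missing=""
for rep in "$expdir"/rep*; do
    for value in $RANGE; do
        if [ ! -d "$rep/trace_$value" ]; then
            # Record the hole as <rep>/<trace folder>.
            missing="$missing ${rep##*/}/trace_$value"
        fi
    done
done
echo "missing runs:$missing"
```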

Extracting experiment data

The log files and traces generated during benchmarks contain useful information for various measurements. The script extract.sh allows measurements to be collected in a generic and automated way. The goal is to extract values for a set of stats. A stat is defined as a number that can be collected from a single benchmark run, such as number of committed transactions, number of dirty pages, etc. The key principle is that a single value is extracted from each benchmark run. Ways to extract multi-dimensional information, such as time series (e.g., number of dirty pages for each checkpoint or CPU utilization per second) will be discussed later on -- the extract script is used solely for stats.

The extract script can be thought of as a meta-script because, similarly to run.sh, it takes another script as argument. The argument script must define the set of stats that will be extracted, as well as how they can be extracted from the log files via grep. This is done by defining three bash arrays: TITLE, STAT, and FILE. Consider the following example (taken from the sample file stats/recovery.sh):

LOAD_OUT=load*.kits.stdout.txt
LOAD_ERR=load*.kits.stderr.txt
RUN_OUT=*run.kits.stdout.txt
RUN_ERR=*run.kits.stderr.txt
REC_OUT=recover.kits.stdout.txt
REC_ERR=recover.kits.stderr.txt
COMP_STATS=computed_stats.txt
LOGINS_OUT=loginspect.stdout.txt

TITLE[0]="commit_count";    STAT[0]="commit_xct_cnt";            FILE[0]=$RUN_OUT
TITLE[1]="dirty_pages";     STAT[1]="restart_dirty_pages";       FILE[1]=$REC_OUT
TITLE[2]="redo_time";       STAT[2]="restart_redo_duration";     FILE[2]=$REC_OUT
TITLE[3]="logs_redone";     STAT[3]="restart_log_redone";        FILE[3]=$REC_OUT
TITLE[4]="pages_dirtied";   STAT[4]="bf_page_dirtied";           FILE[4]=$RUN_OUT
TITLE[5]="recovery_time";   STAT[5]="real";                      FILE[5]=$REC_ERR
TITLE[6]="load_time";       STAT[6]="real";                      FILE[6]=$LOAD_ERR
TITLE[7]="log_bytes";       STAT[7]="log_bytes_generated";       FILE[7]=$RUN_OUT
TITLE[8]="write_bwidth";    STAT[8]="write_bwidth";              FILE[8]=$COMP_STATS
TITLE[9]="analysis_time";   STAT[9]="restart_analysis_duration"; FILE[9]=$REC_OUT
TITLE[10]="undo_time";      STAT[10]="restart_undo_duration";    FILE[10]=$REC_OUT
TITLE[11]="avg_dirty";      STAT[11]="avg_dirty";                FILE[11]=$COMP_STATS

The script defines 12 stats. The TITLE array contains the names we want to give to each stat. In the STAT array, we specify a keyword that will be used to grep-out the value of the stat. The file on which grep will be executed is then given in the FILE array. It must be a file existing in every trace_* folder generated by the repeat script. Instead of supporting arbitrary regular expressions to extract stat values, we rely on the following simple rule: The line containing a stat value must have the format <title><separator><value>, where title is the value given in the TITLE array, separator is a sequence of blank characters (whitespace or tab), and value is the value that will be assigned to the stat.

One important aspect is that the script given as parameter will be sourced by extract.sh, which means it can contain arbitrary code to extract additional values of interest, not directly extractable with a single grep command. In the example above, some stats are extracted from a file called computed_stats.txt, which is not generated by the run script. The sample file stats/recovery.sh contains additional code that generates this file. This generic mechanism allows the generic extraction logic to be reused across different experiments.

The collected stats will be saved inside each repN folder in the file stats.txt. The file contains a table where each line corresponds to one benchmark run, i.e., one trace_* folder, and each column to one stat. The script also aggregates the results of all reps in the stats folder inside the experiment folder. It contains a stats.txt file containing the average values of all stats in each rep, as well as one additional file for each stat. These files contain a table with the values of a single stat in all reps and traces. For each stat, a file *_dev.txt is also generated containing the average, standard deviation, minimum, and maximum values of a stat across all reps.
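The aggregation across reps amounts to simple descriptive statistics. The sketch below computes them with awk for one stat over eight made-up sample values. It uses the population standard deviation, and whether the script itself uses the population or the sample variant is not something we state here.

```shell
#!/bin/sh
# Sketch only: values of one stat across eight reps (made-up numbers).
values="2 4 4 4 5 5 7 9"

summary=$(printf '%s\n' $values | awk '
    NR == 1 { min = $1; max = $1 }
    {
        sum += $1; sumsq += $1 * $1
        if ($1 < min) min = $1
        if ($1 > max) max = $1
    }
    END {
        avg = sum / NR
        # Population standard deviation; a sample deviation would divide by NR - 1.
        dev = sqrt(sumsq / NR - avg * avg)
        printf "%.2f %.2f %d %d", avg, dev, min, max
    }')
echo "avg dev min max: $summary"
```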

Note that a value ? will be generated if the grep command does not match on the file given, or if the file does not exist. This will be interpreted (e.g., by gnuplot) as a missing value.
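The extraction rule and the ? fallback can be illustrated with the sketch below. It is not the code of extract.sh: the function extract_stat is hypothetical, the sample file contents are made up, and awk is used in place of grep to pick the second field of the line whose first field equals the given name.

```shell
#!/bin/sh
# Sketch only: not the actual code of extract.sh.
trace=$(mktemp -d)
# Made-up sample output with blank- and tab-separated values.
printf 'commit_xct_cnt      48213\nbf_page_dirtied\t1022\n' > "$trace/run.kits.stdout.txt"

extract_stat() {
    # $1 = name at the start of the line, $2 = file to search
    value=""
    if [ -f "$2" ]; then
        # First field must equal the name; the second field is the value.
        value=$(awk -v k="$1" '$1 == k { print $2; exit }' "$2")
    fi
    # Fall back to "?" if the file is missing or nothing matched.
    echo "${value:-?}"
}

commits=$(extract_stat commit_xct_cnt "$trace/run.kits.stdout.txt")
dirtied=$(extract_stat bf_page_dirtied "$trace/run.kits.stdout.txt")
nomatch=$(extract_stat restart_undo_duration "$trace/run.kits.stdout.txt")
nofile=$(extract_stat commit_xct_cnt "$trace/recover.kits.stdout.txt")
echo "$commits $dirtied $nomatch $nofile"
```

Emitting a placeholder instead of an empty field keeps the columns of the resulting table aligned, so later processing can tell a missing value from a shifted one.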