This repository contains a collection of scripts to perform experiments with the Shore-MT storage manager and the benchmarks implemented by the Shore-Kits package. It consists of the following components:
- setup.sh Performs an initial one-time setup of environment parameters, e.g., paths to DB and log devices
- run.sh Performs a single benchmark run using the Shore-Kits package
- repeat.sh Performs multiple benchmark runs varying one single parameter across any range of values
- extract.sh Gathers experiment data from traces and logs generated by the repeat script
- genplots.sh Generates plots for the extracted data, including customized gnuplot templates used in publications
- loginspect A C++ program that inspects the log files left behind by a benchmark to extract relevant data and traces
- gnuplot/* A collection of (customized) gnuplot files used in various publications
The following sections document each component in detail.
The script setup.sh is used to collect basic system parameters that will be used throughout the experiments. It supports two basic use cases:
File-based I/O In this setup, both the log and the DB will be stored in filesystem paths to which the invoking user has write permission. This is the standard scenario if I/O performance is not a critical aspect of the experiment. It can also be used to avoid I/O delays altogether by writing only to a ramdisk (i.e., /dev/shm/shore).
Raw device I/O This setup is recommended if the experiment involves I/O-related measurements. It allows using raw device paths (e.g., /dev/sda or /dev/sda1) for either the log or the DB. The script must then be invoked as root. The following operations are performed for each raw device:
- Perform extensive checks to see if any data could be lost on the raw devices (e.g., if they contain a filesystem)
- Allow the invoking user to execute sbin programs (e.g., hdparm) using the /etc/sudoers file (requires sudo)
- [DB only] Grant the invoking user read/write permission on the raw device using udev
- [Log only] Add an exclusive filesystem entry to /etc/fstab to be used by the log, with permissions to the invoking user
The setup runs as an interactive prompt. Other than the path to the devices, the script asks for the path to the shore_kits executable and, in case of raw devices, the username of the user that will be invoking the experiments. Default values are given for each question.
The motivation behind the setup script is to allow benchmarks to be run on raw devices as a normal (non-root) user. Requiring the script to be executed prior to any benchmark also provides extra safety, since it performs extensive checks to avoid erasing data due to a mistyped path (which has already happened to me).
Note: if the setup script was previously run with a different user, the udev setup will fail and the script will not finish successfully. To set up the same DB device with a new user, just delete the file /etc/udev/rules.d/80-shoremt.rules. Also remember to check the permissions and to unmount the log filesystem if it was previously mounted by another user. Another option is to simply delete the filesystem -- the setup script will then reformat it.
Once the setup is finished, a file setup.properties is generated containing variable assignments that will be imported by other scripts. For additional security, a file .SETUP_DONE is also generated so that other scripts can make sure that setup was explicitly invoked by the user.
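The check that other scripts can perform might look roughly as follows. The file names setup.properties and .SETUP_DONE come from the text above; the directory handling and the DB_DEVICE variable are hypothetical stand-ins for illustration.

```shell
# Minimal sketch of the setup guard; SCRIPT_DIR and DB_DEVICE are
# hypothetical -- only the two file names come from the documentation.
SCRIPT_DIR=$(mktemp -d)                 # stand-in for the repository directory
touch "$SCRIPT_DIR/.SETUP_DONE"         # pretend setup.sh has been run
echo 'DB_DEVICE=/dev/shm/shore/db' > "$SCRIPT_DIR/setup.properties"

# Refuse to run if setup.sh was never explicitly invoked.
if [ ! -f "$SCRIPT_DIR/.SETUP_DONE" ]; then
    echo "Please run setup.sh first" >&2
    exit 1
fi
# Import the variable assignments collected by setup.sh.
. "$SCRIPT_DIR/setup.properties"
```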
Running a benchmark
The script run.sh is used to perform a single benchmark run. The benchmark parameters can be set in two ways:
- Using variable assignments (properties files). The script reads benchmark parameters from bash variables with predefined names. See the file parseopts.sh for the default values. To run a benchmark, a properties file is created manually and given to run.sh as argument. The file is simply imported by the script, which means it can also contain arbitrary bash code. This allows for great flexibility while still being very simple. Check the folder suites for some examples.
- [Deprecated] Using command-line arguments. Type run.sh --help or check the parseopts.sh file for details of which parameters are supported. As the number of parameters kept growing, it became cumbersome to maintain this alternative, so the first one should be used instead.
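A properties file passed to run.sh might look like the following sketch. The variable names are taken from this document; the values and the derived NUM_LOADERS line are made up for illustration.

```shell
# Hypothetical benchmark properties file (values are illustrative only).
BENCHTYPE=tm1          # benchmark to run
QUERIED_SF=10          # scaling factor (loaded SF is assumed equal)
FIXED_TRX_COUNT=0      # run for a fixed duration, not a transaction count
EXP_DURATION=60        # benchmark duration in seconds
NUM_THREADS=8          # worker threads
SPREAD_TRXS=1          # assign threads to logical partitions
# Since the file is sourced, arbitrary bash code is allowed too,
# e.g., deriving one parameter from another:
NUM_LOADERS=$((NUM_THREADS / 2))
```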
The primary principle of the run script is that every possible parameter is specified exclusively with a bash variable. Parameters are passed to Shore-Kits via a shore.conf file generated from the template in shore.conf.in. This principle allows better control of experiments. For example, we can guarantee that two runs had the exact same configuration if and only if they used the same set of parameters. To facilitate this kind of verification, the script dumps the parameter values into a config.properties file in the run directory. Furthermore, this principle facilitates the design of the repeat script (see below).
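The generation step can be sketched roughly as follows. The ${...} placeholder syntax and the two option names in the template are assumptions for illustration, not taken from the actual shore.conf.in.

```shell
# Hypothetical sketch of generating shore.conf from a template and
# dumping the parameters; placeholder syntax and option names are assumed.
LOG_SIZE=8192
NUM_THREADS=4

cat > shore.conf.in <<'EOF'
sm_logsize = ${LOG_SIZE}
threads = ${NUM_THREADS}
EOF

# Substitute each ${VAR} placeholder with the current shell value.
sed -e "s/\${LOG_SIZE}/$LOG_SIZE/" \
    -e "s/\${NUM_THREADS}/$NUM_THREADS/" shore.conf.in > shore.conf

# Dump the parameter values so two runs can be compared later.
{
    echo "LOG_SIZE=$LOG_SIZE"
    echo "NUM_THREADS=$NUM_THREADS"
} > config.properties
```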
In the following, we give a brief overview of the most important parameters.
Any kind of log, trace, or dump produced by a benchmark run is saved in a special run directory, which can be specified in the variable BASEDIR. It defaults to /dev/shm/shore. In I/O-intensive experiments, we recommend setting this to a ramdisk path to minimize interference with benchmark I/O.
Benchmark setup (Shore-Kits)
The benchmark type is specified by the BENCHTYPE variable (e.g., tm1). The scaling factor is given in the QUERIED_SF variable. Unlike Shore-Kits, we assume that the loaded and the queried SF are always the same.
Further benchmark parameters are basically those passed to Kits' commands:
FIXED_TRX_COUNT If set to 1, run a fixed number of transactions (i.e., the test command) instead of running for a fixed time duration
EXP_NUM_TRXS Number of transactions to run when running a fixed number of transactions
EXP_DURATION Benchmark duration in seconds when running for a fixed time
SPREAD_TRXS Activates Kits' spread option (values 0 or 1), which assigns threads to transactions based on some logical partitioning criterion (e.g., warehouses in TPC-C).
NUM_THREADS Number of worker threads to spawn for the benchmark
NUM_LOADERS Number of threads to use during the loading phase
TRX_ID Benchmark-specific identifier for which transaction mix to run (default is 0 for all transactions)
The script supports three basic phases when running a benchmark: load, run, and recover. The first two always occur, while the third is optional. The behavior is controlled by the following parameters:
LOAD_AND_RUN If set to 1, the load and run phases will be executed in a single Shore-Kits invocation. Otherwise, one initial invocation loads the DB (using the Kits option clobberdev) and a second one simply runs the benchmark. Separate invocations are preferable if we want to collect data for the load and run phases separately.
WARMUP_BUFFER If set to 1, the Kits command db_fetch will be invoked before the benchmark. This option is only valid if the load and run phases are executed separately.
CRASH_AFTER_EXP If set to 1, the Kits command crash will be executed when the benchmark is finished, and a recovery phase will be executed afterwards. Note that only REDO operations will be performed, because all transactions are finished at the time of the crash.
KILL_AFTER If set to a number greater than zero, the Shore-Kits process will be aborted after the given number of seconds. If recovery is enabled, we would then observe both UNDO and REDO. This can be used to simulate a real crash by software means (AFAIK, as good as it gets for automated experiments).
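Aborting a process after a deadline can be sketched as below. This is only similar in spirit to what KILL_AFTER does; the sleep command is a hypothetical stand-in for the shore_kits process.

```shell
# Sketch of killing a process after a deadline; "sleep 30" stands in
# for the benchmark process, the killer runs concurrently.
KILL_AFTER=1
sleep 30 &                                    # stand-in for shore_kits
pid=$!
( sleep "$KILL_AFTER" && kill -9 "$pid" ) &   # deadline enforcer
wait "$pid"                                   # returns the killed status
status=$?
```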
Storage Manager options
The following options control the properties passed to the SM instance. They are placed in the generated shore.conf file.
LOG_SIZE Size of the active recovery log
LOG_BUFFER_SIZE Size of the log buffer used for flushing. [Note: when scanning the log, including during transaction rollbacks, Shore always goes directly to the partition files, so the buffer seems to be used only as a write buffer for log flushes.]
BUFFER_RATIO Size of the buffer pool as a percentage of the loaded DB size. This is computed from fixed, expected per-scaling-factor sizes (e.g., 130MB per warehouse in TPC-C), so it is only a rough estimate. Also note that the DB usually grows during benchmarks, so set this to something higher than 1.0 to guarantee the absence of buffer replacements.
CHKPT_FREQUENCY Checkpoint frequency in seconds
CHKPT_INTERVAL Interval at which older pages will be flushed to disk at each checkpoint (i.e., second-chance checkpoints). Set to -1 to never flush (the traditional fuzzy checkpoint used by Shore) and to 0 to flush all dirty pages at each checkpoint.
LOG_FLUSHES If set to 1, page flushes invoked by the page cleaner will be logged to allow faster recovery.
PAGE_CLEANER If set to 1, a page cleaner thread will be created when mounting a device (Shore option backgroundflush). Note that the page cleaner behavior may further be controlled by the cleaner policy (currently under development [TODO])
O_DIRECT If set to yes, direct I/O will be used for the database device.
Debug and trace options
TRACE If set to 1, Shore-Kits traces will be produced, as well as any Shore-MT traces enabled via DEBUG_FILES
DEBUG_FILES List of Shore-MT source files (separated by whitespace) for which debug traces will be produced (requires Shore to have been compiled with --enable-trace)
RUN_GDB If set to 1, Shore-Kits will be invoked inside a GDB session
MONITOR_HARDWARE If set to 1, traces will be collected for I/O and CPU utilization (using mpstat). These traces can be processed by the extract script (see below)
Logs and traces
For each benchmark phase, files *.stdout.txt and *.stderr.txt containing the Shore-Kits output will be written to BASEDIR. These files can be used to collect experiment data, since the Kits command sm_stats is invoked after each phase. Other files generated include the output of the hardware monitors, of the log inspector (see below), debug traces, and a dump of the configuration parameters in the file config.properties.
Iterating parameters and repeating
Sitting one level above the run script, the script repeat.sh can be used to perform multiple benchmark runs using controlled parameters. In essence, what one would call an experiment could be mapped to a single invocation of the repeat script.
The script takes as main argument one benchmark configuration file, i.e., a sequence of bash variable assignments as used in run.sh -- this will serve as the base benchmark configuration.
One variable, however, whose name is given in the --var argument, will assume a different value for each benchmark invocation.
Such values are taken from a list given in the --range argument.
Running one benchmark for each value in the range is what we call an iteration.
In order to run multiple iterations, the argument --iterations can be used.
Issue repeat.sh --help for a description of the script usage.
The script also takes two additional parameters related to the database log.
To save the log files of each benchmark run, the argument --save-log must be provided.
Note that the compilation flag KEEP_LOG_PARTITIONS can be used in Shore-MT to disable the deletion of log files and keep the complete history of a benchmark run.
Just be aware that the log volume grows very fast into the gigabyte range.
A second parameter that can be supplied is --inspect-log, which will execute the loginspect program on the log after each benchmark run and save its output.
The program is described in more detail further below.
To understand the parameters in more detail, consider the following example:
repeat.sh \
    --iterations 2 \
    --var CHKPT_FREQ \
    --range "2 5 10" \
    --inspect-log \
    --outdir /home/csauer/runtime/shore/experiment \
    defaults.sh
This gives the following output:
Iterations = 2
Var name = CHKPT_FREQ
Range = 2 5 10
==== Iteration 1 of 2 ====
Running with CHKPT_FREQ = 2
Running with CHKPT_FREQ = 5
Running with CHKPT_FREQ = 10
Saving experiment results to /home/csauer/runtime/shore/experiment/rep1
==== Iteration 2 of 2 ====
Running with CHKPT_FREQ = 2
Running with CHKPT_FREQ = 5
Running with CHKPT_FREQ = 10
Saving experiment results to /home/csauer/runtime/shore/experiment/rep2
This experiment will perform 3 benchmark runs with the checkpoint frequency parameter set to 2, 5, and 10, respectively. Since 2 iterations were requested, there will be a total of 6 runs, so that we have two sample values for each checkpoint frequency.
The benchmark parameters will be read from the file defaults.sh, except for CHKPT_FREQ, which will be overridden by the repeat script.
The --inspect-log argument triggers the loginspect program (see below) after each benchmark run.
All traces and logs generated, including the output of the log inspector, and the log files if --save-log is specified, will be stored in the output directory, in this case /home/csauer/runtime/shore/experiment.
Below we give a brief description of how files are organized in this directory.
After execution of the script, the output directory of the example above is organized as follows.
For each iteration, a folder repN will be created.
If the same experiment (i.e., the same output directory) was also used in an earlier invocation, the folders will be created using the lowest non-existing number.
This allows the same experiment to be performed over different invocations, e.g., every night for 5 days.
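The "lowest non-existing number" rule can be sketched as follows. OUTDIR is a hypothetical stand-in for the experiment folder; only the repN naming comes from the text.

```shell
# Sketch of picking the next rep folder number; OUTDIR stands in for
# the experiment folder.
OUTDIR=$(mktemp -d)
mkdir "$OUTDIR/rep1" "$OUTDIR/rep2"   # pretend two iterations already exist

n=1
while [ -d "$OUTDIR/rep$n" ]; do
    n=$((n + 1))
done
mkdir "$OUTDIR/rep$n"                 # the new iteration lands in rep3
```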
Inside each rep folder, one trace_* folder is created for each value assumed by the experiment variable, in this case the checkpoint frequencies 2, 5, and 10.
Finally, all files output by the benchmark run are saved in the trace folder.
This directory structure will be exploited by the other scripts to collect experiment data.
From now on, we shall refer to the output directory of the repeat script as the experiment folder.
Handling failed runs
If a benchmark run fails, i.e., if the shore_kits binary returns a non-zero status code, then a trace folder will not be created in the experiment folder.
This means that failed runs will leave "holes" in the experiment, which may cause the extraction phase to fail.
This design choice was made because we believe it is better to omit the results, causing extraction to fail, than to generate invalid data that can be misinterpreted when analyzing the results (e.g., averages that include many zero values).
To cope with this situation, the repeat script offers the argument --run-missing.
When set, it causes the script to go through the available rep folders and run the benchmark once again for each missing trace folder.
With --run-missing activated, one single failed run causes the whole repeat script to exit with failure.
This means that this flag can be used to verify whether all repetitions completed successfully after an experiment is finished.
Extracting experiment data
The log files and traces generated during benchmarks contain useful information for various measurements. The script extract.sh allows measurements to be collected in a generic and automated way. The goal is to extract values for a set of stats. A stat is defined as a number that can be collected from a single benchmark run, such as the number of committed transactions, the number of dirty pages, etc. The key principle is that a single value is extracted from each benchmark run. Ways to extract multi-dimensional information, such as time series (e.g., the number of dirty pages at each checkpoint or CPU utilization per second), will be discussed later on -- the extract script is used solely for stats.
The extract script can be thought of as a meta-script because, similarly to run.sh, it takes another script as argument.
The argument script must define the set of stats that will be extracted, as well as how they can be extracted from the log files via grep.
This is done by defining three bash arrays: TITLE, STAT, and FILE.
Consider the following example (taken from the sample file stats/recovery.sh):
LOAD_OUT=load*.kits.stdout.txt
LOAD_ERR=load*.kits.stderr.txt
RUN_OUT=*run.kits.stdout.txt
RUN_ERR=*run.kits.stderr.txt
REC_OUT=recover.kits.stdout.txt
REC_ERR=recover.kits.stderr.txt
COMP_STATS=computed_stats.txt
LOGINS_OUT=loginspect.stdout.txt

TITLE="commit_count"; STAT="commit_xct_cnt"; FILE=$RUN_OUT
TITLE="dirty_pages"; STAT="restart_dirty_pages"; FILE=$REC_OUT
TITLE="redo_time"; STAT="restart_redo_duration"; FILE=$REC_OUT
TITLE="logs_redone"; STAT="restart_log_redone"; FILE=$REC_OUT
TITLE="pages_dirtied"; STAT="bf_page_dirtied"; FILE=$RUN_OUT
TITLE="recovery_time"; STAT="real"; FILE=$REC_ERR
TITLE="load_time"; STAT="real"; FILE=$LOAD_ERR
TITLE="log_bytes"; STAT="log_bytes_generated"; FILE=$RUN_OUT
TITLE="write_bwidth"; STAT="write_bwidth"; FILE=$COMP_STATS
TITLE="analysis_time"; STAT="restart_analysis_duration"; FILE=$REC_OUT
TITLE="undo_time"; STAT="restart_undo_duration"; FILE=$REC_OUT
TITLE="avg_dirty"; STAT="avg_dirty"; FILE=$COMP_STATS
The script defines 12 stats.
The TITLE array contains the names we want to give to each stat.
In the STAT array, we specify a keyword that will be used to grep out the value of the stat.
The file on which grep will be executed is then given in the FILE array.
It must be a file existing in every trace_* folder generated by the repeat script.
Instead of supporting arbitrary regular expressions to extract stat values, we rely on the following simple rule: the line containing a stat value must have the format <keyword><separator><value>, where keyword is the value given in the STAT array, separator is a sequence of blank characters (whitespace or tab), and value is the value that will be assigned to the stat.
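Under this rule, extracting a stat value reduces to a grep plus a field split. The sample file content below is fabricated, and the awk field split is our illustrative choice, not necessarily what extract.sh does internally.

```shell
# Fabricated sample output following the <keyword><blanks><value> rule.
RUN_OUT=$(mktemp)
printf 'commit_xct_cnt\t12345\nlog_bytes_generated 6789\n' > "$RUN_OUT"

STAT=commit_xct_cnt
# Find the line containing the keyword and take the blank-separated value.
value=$(grep "$STAT" "$RUN_OUT" | awk '{ print $2 }')
```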
One important aspect is that the script given as parameter will be sourced by extract.sh, which means it can contain arbitrary code to extract additional values of interest that are not directly extractable with a single grep command.
In the example above, some stats are extracted from a file called computed_stats.txt, which is not generated by the run script.
The sample file stats/recovery.sh contains additional code that generates this file.
This mechanism allows the generic extraction logic to be reused across different experiments.
The collected stats will be saved inside each repN folder in a file containing a table where each line corresponds to one benchmark run, i.e., one trace_* folder, and each column to one stat.
The script also aggregates the results of all reps in the stats folder inside the experiment folder.
It contains a stats.txt file with the average values of all stats in each rep, as well as one additional file for each stat.
These files contain a table with the values of a single stat in all reps and traces.
For each stat, a file *_dev.txt is also generated containing the average, standard deviation, minimum, and maximum values of the stat across all reps.
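The per-stat aggregation can be sketched with a one-pass awk script. The single-column input format is an assumption about what feeds the *_dev.txt files; the three values are fabricated.

```shell
# Compute average, (population) standard deviation, min, and max over a
# fabricated column of stat values, one value per rep.
result=$(printf '2\n4\n6\n' | awk '
    { sum += $1; sumsq += $1 * $1
      if (NR == 1 || $1 < min) min = $1
      if (NR == 1 || $1 > max) max = $1 }
    END { avg = sum / NR
          dev = sqrt(sumsq / NR - avg * avg)
          printf "%g %g %g %g", avg, dev, min, max }')
```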
Note that a value ? will be generated if the grep command does not match in the given file, or if the file does not exist.
This will be interpreted (e.g., by gnuplot) as a missing value.