Atlassian Performance Testing Framework

The Performance Testing Framework, previously known as the Elastic Experiment Executor (E³), is a framework to set up, execute, and analyze performance-under-load experiments on Atlassian Server and Data Center product instances. These experiments can be useful for comparing product performance across different versions, software configurations, and hardware infrastructure setups under workloads of different sizes and shapes.

While the experiment runs, both system-level and application-level metrics are monitored from all machines involved using the collectd service. This information, as well as data on response times and throughput gathered from client and server-side log files, can be summarized in a number of different graphical charts and visualizations.

Currently, the Performance Testing Framework can run experiments on Bitbucket Server, Bitbucket Data Center, and Confluence Data Center. All compute infrastructure can be provisioned automatically in the Amazon Web Services (AWS) cloud. All three products additionally support running experiments on external infrastructure.

Note: Even if your infrastructure is hosted on AWS, it is still considered external infrastructure if it has not been provisioned by the Performance Testing Framework. In addition to the setup and configuration overview described below, you'll also need to refer to the section titled Bring your own infrastructure for more specific instructions on how to tailor the framework for your purposes.

Setup

Pre-requisites

Software

The framework requires that experiments be run from a Linux or MacOS machine with a number of open-source software packages installed. The easiest way to install these prerequisites is with your operating system's package manager. For example, on the machine that will be used to run Performance Testing Framework experiments, you might run one of the following:

MacOS with homebrew
brew    install gnuplot --with-cairo
brew    install imagemagick python rrdtool
Linux Ubuntu, Debian, ...
apt-get install gnuplot imagemagick python rrdtool librrd-dev
Linux Red Hat, CentOS, ...
yum     install gnuplot imagemagick python rrdtool

For other systems, refer to your system's documentation on how to install these prerequisites.

AWS

The Performance Testing Framework provides the ability to easily provision machines in the AWS Cloud.

You can specify the credentials for your AWS account in any of the places that boto3 looks. (See configuring credentials.)
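For example, boto3 reads the standard AWS environment variables, so one option is to export your keys (shown here with placeholder values to substitute) in the shell from which you run the framework:

export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>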

If your organization uses IAM roles to authenticate with AWS, the framework also includes the ability to acquire AWS credentials automatically. For an example implementation, see AtlassianAwsSecurity.py.

  .     Warning:
 / \    Provisioning AWS infrastructure can incur significant service charges from Amazon.
/ ! \   You are responsible for all Amazon service charges arising from your use of the Performance Testing Framework.
‾‾‾‾‾   You are responsible for securing any proprietary data on your test machines.

Installing

Most of the Performance Testing Framework is written in Python and requires a number of Python libraries. These requirements are described in a standard requirements.txt file, and can be installed using the standard Python pip and virtualenv tools as follows:

# From the elastic-experiment-executor repository home
sudo pip install virtualenv
virtualenv .ve
source .ve/bin/activate
cd e3; pip install -r requirements.txt

Quick start

Once all requirements are installed, running the Orchestrate.py script with any one of the example experiments (found in e3-home/experiments) will provision machines in AWS with cluster and client nodes, execute a specified workload, gather data, and analyze the results. In the example here, we execute the cluster-scaling-ssh experiment, which is designed to stress different Bitbucket instances with lots of parallel Git hosting operations over SSH:

./Orchestrate.py -e cluster-scaling-ssh

Note: if you are running an experiment for Confluence Data Center, you will need a developer license. If you already have a production license, see how to get a Confluence developer license. Once you have it, find the property confluence.license.key in your experiment file and replace <confluence_license_key> with your license key. Note that your experiments will be limited by the number of users your license supports, which is determined by the production license.


Experiments

Defining an Experiment

Experiments are described in JSON files that define:

  • the independent experiment thread(s) to run (in parallel),
  • the type of instance under test,
  • the size and shape of the instance(s) under test,
  • the shape of the data (projects, repositories, users, etc.) in the instance(s),
  • the size and shape of the client machine(s) that put load on the instance(s),
  • the mix of operations in the workload, and
  • the amount of load to apply in each stage of the experiment (the number of client threads).

There is a library of pre-defined experiments under e3-home/experiments, or you can define your own.

The workloads referenced in the experiment JSON file are defined in e3-home/workloads.

Experiment Phases

Performance experiments have a number of phases, which correspond to the following scripts:

  1. Provision, which spins up product instance(s), cluster(s) of worker machines, and other associated infrastructure in AWS. This phase automatically creates a 'run file' in e3-home/runs from your experiment definition. (For Bitbucket Server, you can skip this step if you have your own infrastructure, but you will need to create a run file and fill in your machines' details first -- see below for more information.)
  2. Run, which actually runs the client workload from the worker machine(s) (and also resets the product instance(s) to a known good state in between stages).
  3. Gather, which gathers data from an experiment run (either running or finished) off all the machines.
  4. Analyze, which analyzes the data from an experiment run into charts that help visualize how well the instance(s) performed and all their "vital signs" while under stress.
  5. Deprovision, which deprovisions the test machines and archives the experiment run.

In addition to the global Orchestrate.py script, which runs all these phases in order, each phase can also be run individually:

cd ./e3
./Provision.py -e <experiment_name>
./Run.py -r <run_name>
./Gather.py -r <run_name>
./Analyze.py -r <run_name>

Workloads

A workload defines the types and distribution of execution scripts run in your experiment. You can only define a single workload per experiment, but each workload can contain multiple scripts. Workload descriptors are stored in e3-home/workloads.

A workload description contains:

  • script: the path of a script to be executed during the experiment relative to the e3/execution/grinder directory
  • description: a human-readable description of the script; used in graphs produced by analysis
  • scale: the relative weight to give the experiment in the analysis stage. Can be used to reduce the visual impact of less important or cheaper operations.
  • weight: the weight to give to this execution script. All weights in a workload should add up to 1.0.
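
For illustration, a workload descriptor pairing each script with its description, scale, and weight might look like the following. The file structure, script paths, and numbers here are illustrative assumptions, not one of the shipped workloads:

[
    {
        "script": "bitbucket/GitCloneOverSsh.py",
        "description": "Git clone (SSH)",
        "scale": 1.0,
        "weight": 0.7
    },
    {
        "script": "bitbucket/BrowsePullRequests.py",
        "description": "Browse pull requests",
        "scale": 0.5,
        "weight": 0.3
    }
]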

Weights

The way the Performance Testing Framework assigns work to client threads and workers means that weights in a workload definition do not have infinite resolution. An example to illustrate this:

experiment.json
{
    ...
    "stages": [
        {
            "clients": 10
        },
        {
            "clients": 20
        }
    ]
}

Because the experiment defined above runs a stage with only 10 clients, the maximum resolution the weights can afford is 1/10, or 0.1. Therefore, a script defined with a weight under 0.1 may never be executed in that stage.

The framework will also attempt to distribute client threads amongst all worker machines evenly. This means when choosing the number of client threads for each stage, you should aim to ensure the number is divisible by the number of worker machines you have allocated.
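
As a rough illustration of why resolution matters, the following Python snippet (illustrative only; not the framework's actual allocation code) shows how a low-weighted script can be rounded out of a 10-client stage entirely:

# Illustrative only: a naive weight-to-thread allocation for one stage.
clients = 10
weights = {"git-clone": 0.65, "git-push": 0.30, "web-browse": 0.05}
threads = {script: int(weight * clients) for script, weight in weights.items()}
print(threads)  # {'git-clone': 6, 'git-push': 3, 'web-browse': 0}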

Analysis

The Performance Testing Framework provides the ability to produce graphs from the data collected from the worker machines as well as the product instance machines.

There are two types of analysis scripts included:

  • gnuplot
  • rrdtool

Calling ./e3/Analyze.py -r <run_name> will produce both types of graphs for a specific run.

To produce only one type of graph:

./e3/analysis/gnuplot/Analyze.py -r <run_name>

or

./e3/analysis/rrdtool/Analyze.py -r <run_name>

Any produced graphs will be stored in the e3-home/runs/<run_name>/analysis directory.

Note: in order to get graphs from both rrdtool and gnuplot after an experiment run, you must use the Analyze.py script. Orchestrate.py currently only produces graphs using rrdtool. For more information on Analysis and the types of graphs the framework produces, see the dedicated README.

Datasets

Bitbucket and Confluence differ in how datasets are supplied and used to populate instances for load testing in experiments.

Bitbucket snapshots

Snapshot descriptor files are used to describe a Bitbucket dataset. They are stored in e3-home/snapshots. Atlassian provides a number of default snapshots, all of which include the corresponding AWS snapshots (EBS, and where applicable RDS and Elasticsearch) for the data described in the JSON snapshot descriptor.

If you are running experiments in AWS, these default snapshots can help to quickly get you up and running experiments.

If you are running experiments elsewhere, you will need to provide the framework with a descriptor of the dataset of the Bitbucket instance you are running an experiment against. To generate a new snapshot descriptor, you can use the Snapshot script:

./util/Snapshot.py --url "https://bitbucket-instance" --username admin --password admin --name snapshot-name --description "snapshot description"

Creating your own dataset in AWS

After you have run the ./util/Snapshot.py script with the --aws true option, you will need to take an EBS (and possibly RDS and Elasticsearch) snapshot of your data. This is most easily achieved through the use of the Atlassian Bitbucket DIY backup scripts, which come with the Atlassian Bitbucket AMI. Alternatively, you can clone the repository from bitbucket.org.

See Using Bitbucket Server DIY backup in AWS.

You will then need to configure the snapshot fields for EBS (git repositories), RDS (database) and ES (search index) in the generated snapshot descriptor (see the example below).

The following example configuration specifies snapshots taken from a Bitbucket Data Center instance.

"ebs": {
    "ap-southeast-2": "snap-4ee80fde",
    "us-east-1": "snap-25e718aa"
},
"rds": {
    "account": "098706035825",
    "id": "e3-small"
},
"es": {
    "id" : "e3-small",
    "bucket": "bitbucket-server-e3"
}

SSH Keys

A keys file can be used to store the public/private key pairs that users use to authenticate with Bitbucket when running an experiment. These descriptors are stored in e3-home/keys.

If you are using an existing dataset, you will need to provide the public and private SSH keys that the framework can use to authenticate over SSH. If you provide a key-file to the Snapshot script, it will look up the users in your instance and map the private key to the configured public key for each user.

Passwords

Currently, the Snapshot script assumes all users have the same password (password by default).

Note: Every user is assumed to have access to every project/repository. Test scripts are not resilient to failures because of insufficient permissions.

Confluence spaces

Confluence Data Center instances are populated using the file called space-import.zip.xml, found at the location specified in a Confluence experiment file. By default, this is the Confluence resources directory. Although you can replace this file with your own space export, doing so is not recommended because the existing space contains specific content that the scripts use during testing.


Customizing your configuration

In addition to custom datasets, the Performance Testing Framework also supports additional product-specific custom configuration for fine-tuning the shape and size of the product instances and the nature of the load testing.

Bitbucket Server

There are a number of properties that allow you to customize your experiment for Bitbucket. The following basic JSON experiment file contains an example of their usage.

{
    "product": "Bitbucket",
    "duration": 600000,
    "workload": "bitbucket-mixed",
    "threads": [
        {
            "name": "1-node",
            "instance": {
                "name": "1-node",
                "version": "4.11.0",
                "snapshot": "e3-small",
                "template": "BitbucketDataCenter",
                "properties": [
                    "plugin.bitbucket-git.throttle.ref.advertisement=false",
                    "plugin.bitbucket-scm-cache.upload-pack.enabled=false",
                    "throttle.resource.scm-hosting.strategy=adaptive",
                    "throttle.resource.scm-hosting.adaptive.limit.min=8",
                    "throttle.resource.scm-hosting.adaptive.limit.max=100",
                    "throttle.resource.scm-hosting.adaptive.interval=2",
                    "throttle.resource.scm-hosting.adaptive.growth.max=1.0",
                    "throttle.resource.scm-hosting.adaptive.target.cpu=0.8"
                ],
                "parameters": {
                    "ClusterNodeMin": "1",
                    "ClusterNodeMax": "1"
                }
            },
            "worker": {
                "name": "worker-cluster-1",
                "template": "WorkerCluster",
                "parameters": {
                    "ClusterSize": "4"
                }
            },
            "stages": [
                {
                  "clients": 40
                },
                {
                  "clients": 80
                },
                {
                  "clients": 160
                }
            ]
        }
    ]
}

For further custom configurations, see how to override CloudFormation template or application properties using the bitbucket.properties file.

Confluence Server

Anatomy of a basic experiment

{
    "product": "Confluence",
    "duration": 180000,
    "workload": "confluence-mixed",
    "threads": [
        {
            "name": "1-node",
            "instance": {
                "name": "1-node",
                "version": "6.1.0",
                "template": "ConfluenceDataCenter",
                "properties": [
                    "confluence.data.directory=resources/confluence",
                    "confluence.license.key=<license-key-string>",
                    "confluence.number.of.instances=1",
                    "confluence.number.of.users=60",
                    "confluence.min.users.to.run.experiment=60"
                ]
            },
            "worker": {
                "name": "2-node-worker-cluster",
                "template": "WorkerCluster",
                "parameters": {
                    "ClusterSize": "2"
                }
            },
            "stages": [
                {
                    "clients": 20
                },
                {
                    "clients": 40
                },
                {
                    "clients": 60
                }
            ]
        }
    ]
}

The basic experiment shown above consists of two configuration components, Grinder and Confluence. The custom configuration for each is detailed below.

Grinder custom configuration

The following properties specify how the load test will be carried out by Grinder:

  • duration: how long to run the load test in milliseconds (ms)
  • workload: the name of the workload (as defined in e3-home/workloads) to run in each experiment thread
  • worker: the client configuration
    • ClusterSize: determines how many nodes will be acting as clients running scripts against the Confluence instance
  • stages: how many consecutive times to run an experiment on an instance and how many clients to simulate at each stage

Confluence custom configuration

The following properties can be specified to customize your Confluence instance:

  • version: the Confluence product version to spin-up
  • properties: a list of properties that Confluence uses during set up
    • confluence.data.directory: path relative to e3-home where Confluence setup data can be found, including the space import and custom mail server configuration
    • confluence.license.key: license needed to set up the Confluence instance (NB: slashes need to be escaped in json)
    • confluence.number.of.instances: the size of the Confluence cluster to set up
    • confluence.number.of.users: the maximum number of clients simulated in an experiment thread over all stages
    • confluence.min.users.to.run.experiment: the minimum number of successfully created users required for the experiment to run (a guard against user-creation failures during setup); the default is 0, meaning the experiment will try to run even if no users are successfully created.
    • mailtrap.api.token: optional parameter specifying a MailTrap server to use for testing email notification load. If a token is not provided, the framework falls back to the provided custom mail configuration file mail-server.properties.

Bring your own infrastructure

The Performance Testing Framework allows you to run tests against your own infrastructure if you don't want to use AWS. Currently it only supports Linux systems. In order to do so you will need to:

  1. Set up and configure a product instance and some additional software used to collect performance metrics.
  2. Write a configuration file describing your product instance(s).
  3. Configure a worker cluster that will be responsible for generating load and write a configuration describing it.
  4. Write an experiment file describing the performance test.

Setting up your server product instance

Setting up Bitbucket

This section describes the steps required to set up Bitbucket Server so that it can be used to serve experiment requests while also reporting performance metrics.

  1. Create a public/private key pair that will be used to authenticate against each node in your cluster (an example command is shown after this list). The same key pair must be used across all cluster nodes.
  2. Set up an e3-user with the key pair created above.
  3. Ensure the SSH daemon is running on each node serving the Bitbucket cluster instance(s).
  4. Choose a filesystem and database snapshot that is representative of the data that you expect to see in your production instances.
  5. Configure Bitbucket Server with the filesystem and database selected above.
  6. Generate a Snapshot for your test data and copy it to e3-home/snapshots.
  7. Enable JMX monitoring for your Bitbucket instance.
  8. Install the collectd daemon and configure it with the example bitbucket-collectd.conf configuration file contained within this repository. You may need to make some adjustments depending on your Linux distribution / collectd installation. See the dedicated README for more details. Make sure the daemon is running when the test begins.
  9. Write a json file describing your Bitbucket Server instance containing the following fields:
    • A list of ClusterNodes connection strings
    • A URL for the Bitbucket Server instance
    • The name of the snapshot file generated in step 6
    • admin_password, the administrator password for the system under test
  10. Copy both the json file describing the instance and the private key file for the instance to your e3-home/instances directory. Both files should have identical names with different extensions .json and .pem respectively.
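
For step 1, a suitable key pair can be generated with ssh-keygen; the file name here is arbitrary, though a .pem extension keeps it consistent with the naming convention in step 10:

ssh-keygen -t rsa -b 4096 -f e3-bitbucket.pem -N ''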

Example two-node Bitbucket Server configuration file.

{
  "ClusterNodes": [
    "e3-user@bitbucket-server-node-1",
    "e3-user@bitbucket-server-node-2"
  ],
  "URL": "http://bitbucket-server",
  "snapshot": "my-snapshot",
  "RunConfig": {
    "admin_password": "s3cr3t"
  }
}

Setting up Confluence

This section describes the steps required to set up Confluence so that it can be used to serve experiment requests while also reporting performance metrics.

Note: As part of the set up, you will need to create new test users, as well as provide authentication credentials for the framework to remove / import / modify test data. If you are using the Performance Testing Framework on an instance with important data, it is highly recommended that you back up Confluence first.

  1. Create a public/private key pair that will be used to authenticate against each node in your cluster. The same key pair must be used across all cluster nodes.
  2. Set up an e3-user with the key pair created above.
  3. Ensure the SSH daemon is running on each node serving the Confluence cluster instance(s).
  4. Enable JMX monitoring for your Confluence instance.
  5. Configure your Confluence instance so that secure administrator sessions (WebSudo) are disabled, the remote API is enabled, and the onboarding plugin is disabled. Optionally, set up a mail server.
  6. Set up test users on your Confluence instance by running ./e3/util/AddUsers.py from the PTF machine. You'll need to pass in the Confluence base URL and the max number of clients you intend to spin up during the experiment, as well as authentication credentials with the appropriate user creation privileges. Make sure your license can support the number of test users to be created.

    Example usage:

    ./e3/util/AddUsers.py --product Confluence --url http://localhost:8090/confluence --users 100

  7. Install the collectd daemon (version >= 5.2) and configure it with the example confluence-collectd.conf configuration file contained within this repository. You may need to make some adjustments depending on your Linux distribution / collectd installation. See the dedicated README for more details. Make sure the daemon is running when the test begins.
  8. Write a json file describing your Confluence instance containing the following fields:
    • A list of ClusterNodes connection strings
    • A URL for the Confluence server
  9. Copy both the json file describing the instance and the private key file for the instance to your e3-home/instances directory. Both files should have identical names with different extensions .json and .pem respectively.

Example two-node Confluence configuration file.

{
  "ClusterNodes": [
    "e3-user@confluence-server-node-1",
    "e3-user@confluence-server-node-2"
  ],
  "URL": "http://confluence-server"
}

Configuring worker nodes

This section describes the steps needed to set up a worker cluster that will execute load against your product instance while also collecting metrics that will be useful in understanding your experiment.

Create a public/private key pair that will be used to authenticate against each node in your cluster. The same key pair must be used across all worker nodes.

For each node in your worker cluster:

  1. Install the following
    • git
    • java
    • python 2.7
    • collectd (version >= 5.2)
  2. Setup collectd using the worker-collectd.conf configuration file contained within this repository as a template. You may need to make some adjustments depending on your Linux distribution / collectd installation. See the dedicated README for more details. Make sure the daemon is running when the test begins.
  3. Ensure the SSH daemon is running on the worker node.
  4. Create an e3-user with sudo permissions that can be authenticated using the public/private key pair you created for this cluster. There should not be a password prompt when executing sudo commands. You may have to edit your /etc/sudoers config to include something like the following:
e3-user ALL=(ALL) NOPASSWD:ALL
  5. Create a /media/data directory owned by the e3-user.
  6. Write a json file describing your worker cluster.
  7. Copy the json file that describes the worker cluster and the private key file for the instance into your e3-home/instances directory. Both files should have identical names with different extensions .json and .pem respectively.

Example worker cluster configuration file.

{
  "ClusterNodes": [
    "e3-user@worker-node-1",
    "e3-user@worker-node-2",
    "e3-user@worker-node-3",
    "e3-user@worker-node-4"
  ]
}

Example experiment file

Describe your experiment, making reference to the infrastructure configuration files you created above. Because you are not provisioning your instances with the Performance Testing Framework, you may not need all the information described in the Experiments section above. Below is an example experiment file containing the minimum amount of information needed to run experiments on your own infrastructure.

{
  "product": "Bitbucket",
  "duration": 10000,
  "workload": "bitbucket-mixed",
  "threads": [
    {
      "name": "2-node",
      "instance": {
        "stack": {
          "Name": "two-node-cluster"
        }
      },
      "stages": [
        {
          "clients": 40
        }
      ],
      "worker": {
        "stack": {
          "Name": "four-node-worker"
        }
      }
    }
  ]
}

The Name field shown above refers to the names of the json files containing your cluster and worker configurations.

Pulling it all together with a fully worked example

This example shows how to run an experiment comparing a two-node and a four-node Bitbucket cluster under increasing load levels. The example applies equally to Confluence, except that the snapshot and RunConfig information can be omitted.

First define the product instance configuration files.

two-node.json

{
  "ClusterNodes": [
    "e3-user@bitbucket-server-node-1",
    "e3-user@bitbucket-server-node-2"
  ],
  "URL": "http://bitbucket-server-with-2-nodes",
  "snapshot": "my-snapshot",
  "RunConfig": {
    "admin_password": "s3cr3t"
  }
}

four-node.json

{
  "ClusterNodes": [
    "e3-user@bitbucket-server-node-1",
    "e3-user@bitbucket-server-node-2",
    "e3-user@bitbucket-server-node-3",
    "e3-user@bitbucket-server-node-4"
  ],
  "URL": "http://bitbucket-server-with-4-nodes",
  "snapshot": "my-snapshot",
  "RunConfig": {
    "admin_password": "s3cr3t"
  }
}

Next define the worker nodes.

two-node-worker.json

{
  "ClusterNodes": [
    "e3-user@worker-node-1",
    "e3-user@worker-node-2",
    "e3-user@worker-node-3",
    "e3-user@worker-node-4"
  ]
}

and the identical four-node-worker.json.

Copy these four files, along with their respective private SSH keys, to ./e3-home/instances so that the directory now contains:

  • two-node.json
  • two-node.pem
  • four-node.json
  • four-node.pem
  • two-node-worker.json
  • two-node-worker.pem
  • four-node-worker.json
  • four-node-worker.pem

Define your experiment in a file named my-experiment.json. In this case, we go from 40 to 200 concurrent connections in 5 stages, each lasting 5 minutes and executing a bitbucket-mixed workload.

{
  "product": "Bitbucket",
  "duration": 300000,
  "workload": "bitbucket-mixed",
  "threads": [
    {
      "name": "2-node",
      "instance": {
        "stack": {
          "Name": "two-node"
        }
      },
      "stages": [
        { "clients": 40 },
        { "clients": 80 },
        { "clients": 120 },
        { "clients": 160 },
        { "clients": 200 }
      ],
      "worker": {
        "stack": {
          "Name": "two-node-worker"
        }
      }
    },
    {
      "name": "4-node",
      "instance": {
        "stack": {
          "Name": "four-node"
        }
      },
      "stages": [
        { "clients": 40 },
        { "clients": 80 },
        { "clients": 120 },
        { "clients": 160 },
        { "clients": 200 }
      ],
      "worker": {
        "stack": {
          "Name": "four-node-worker"
        }
      }
    }
  ]
}

You will need to create a new folder with the name of your experiment, in this case my-experiment, in the ./e3-home/runs directory. If the ./e3-home/runs directory does not exist, create it. Copy your experiment file to ./e3-home/runs/my-experiment/.
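
For example, assuming my-experiment.json sits in your current directory at the repository root:

mkdir -p ./e3-home/runs/my-experiment
cp my-experiment.json ./e3-home/runs/my-experiment/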

Running your experiment

Before running an experiment, make sure you have followed the Setup section instructions to install the proper prerequisites.

If you are testing Bitbucket, you can now simply run this experiment by going to the e3 directory and running

./Run.py -r my-experiment

If you are testing Confluence, you will need to pass in your admin user credentials to allow the Performance Testing Framework to reset Confluence between stages. Go to the e3 directory and run

./Run.py -r my-experiment -u <admin_username> -p <admin_password>

Analyzing your experiment data

Once the experiment has completed its run, you'll want to look at the data gathered by the Performance Testing Framework. Gathering and analyzing the data from your instance and worker nodes is as simple as running the following commands from the e3 directory:

./Gather.py -r my-experiment
./Analyze.py -r my-experiment

The data files and resulting graphs will be found in the experiment run folder created earlier (i.e. ./e3-home/runs/my-experiment).


Contributing

Pull requests, issues and comments welcome. For pull requests:

  • Follow the existing style
  • Separate unrelated changes into multiple pull requests

See the existing issues for things to start contributing.

For bigger changes, make sure you start a discussion first by creating an issue and explaining the intended change.

Atlassian requires contributors to sign a Contributor License Agreement, known as a CLA. This serves as a record stating that the contributor is entitled to contribute the code/documentation/translation to the project and is willing to have it used in distributions and derivative works (or is willing to transfer ownership).

Prior to merging your pull requests, please follow the appropriate link below to digitally sign the CLA. The Corporate CLA is for those who are contributing as a member of an organization and the individual CLA is for those contributing as an individual.

Support

The Performance Testing Framework is provided as is, and is not covered by Atlassian support. Our support team won’t be able to assist you with interpreting the test results.

Head to the performance-testing-framework topic on Atlassian Community to see how others have been using the framework, share your experiences, and ask for help from the community.

License

Copyright (c) 2016 Atlassian and others. Apache 2.0 licensed, see LICENSE.txt file.