Windows Azure Setup

Introduction

The azure_scripts folder in the sparkseq repository contains several shell scripts for managing a SparkSeq cluster in Windows Azure.

Prerequisites

To use the management scripts, you need to:

1. Have a Windows Azure account up and running.

2. Install azure-cli - a cross-platform command-line tool for managing Windows Azure.

3. Download and import your publish settings - a quick guide can be found here.

4. Generate Windows Azure compatible keys - more information here.

5. Configure a Windows Azure Virtual Network and an Affinity Group (a quick guide can be found here).

6. Install parallel-ssh package.
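
Before running the scripts, it can help to confirm that the command-line tools from points 2 and 6 are on your PATH. The sketch below is not part of the repository. It assumes the azure-cli binary is called azure and the parallel-ssh binary is called parallel-ssh, which may differ on your system (some distributions install it as pssh).

```shell
#!/bin/sh
# Print the name of every given tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed binary names: "azure" (azure-cli) and "parallel-ssh".
# Adjust them if your installation uses different names.
missing=$(missing_tools azure parallel-ssh)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing >&2
fi
```

If nothing is printed, both tools were found.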

Quick start

1. Edit azureConfig.cfg - the configuration file used to create a SparkSeq cluster in Windows Azure. The most important parameters are:

  • userName - the user name used for password-less ssh connections
  • privKey - the path to your private key generated in point 4 (prerequisites section)
  • instNumber - the total number of instances to create (i.e. Hadoop name nodes, Hadoop data nodes/Spark worker nodes, etc.)
  • vmSize - the size of all instances in Windows Azure
  • instPreffix - a prefix that, together with a 3-digit number, forms each instance name, e.g. sparkseq001, sparkseq002, etc. It has to be unique in Windows Azure.
  • location - where to store your instances' drives (not used in this version of the script; the location is set when creating the Affinity Group)
  • cert - your certificate created in point 4 (prerequisites section)
  • vpnName - your virtual network created in point 5 (prerequisites section; has to be unique as well)
  • agName - your Affinity Group created in point 5 (prerequisites section; has to be unique as well)

Example azureConfig.cfg:
userName=mesos
instList="instances.txt"
privKey="myPrivateKey.key"
instNumber=8
imageName=b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-12_04_3-LTS-amd64-server-20131205-en-us-30GB
vmSize=a6
instPreffix=sparkseq
location="West Europe"
endPoints="443:8888,80:80,4040:4040,7077:7077,8080:8080,50010:50010,50070:50070,50075:50075"
instanceList="instances.txt"
cert="myCert.pem"
vpnName=sparkseqvpn
agName=sparkseqag
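
To illustrate how instPreffix and instNumber combine, here is a minimal sketch that prints the instance names the naming rule above implies. It is not taken from deployAzureCluster.sh. The function name instance_names is hypothetical, and numbering from 001 is an assumption based on the sparkseq001, sparkseq002 example.

```shell
#!/bin/sh
# Values copied from the sample azureConfig.cfg above.
instPreffix=sparkseq
instNumber=8

# Print one instance name per line: the prefix followed by a
# zero-padded 3-digit index, starting at 001 (assumed start index).
instance_names() {
    prefix=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s%03d\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

instance_names "$instPreffix" "$instNumber"
```

With the sample values this lists sparkseq001 through sparkseq008. Since instance names have to be unique in Windows Azure, checking the generated list before deployment is a cheap way to spot a clash with an existing prefix.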

2. Run deployment script:

./deployAzureCluster.sh

Next, you can use Ansible playbooks to continue configuring the instances.

To shut down your cluster (preserving the state of all instances and deleting nothing, just releasing resources), run:

./shutdownAzureCluster.sh

and to start it again:

./startAzureCluster.sh

To destroy it completely (deleting all instances and their hard drives permanently):

./destroyAzureCluster.sh
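
Because destroying the cluster cannot be undone, you may want to guard it behind a confirmation prompt. The following is a hypothetical wrapper, not part of the repository. The function names script_for, confirmed and run_action are made up for illustration, and it assumes the four scripts sit in the current directory.

```shell
#!/bin/sh
# Hypothetical wrapper: map an action name to the matching cluster script.
script_for() {
    case "$1" in
        deploy)  echo "./deployAzureCluster.sh" ;;
        stop)    echo "./shutdownAzureCluster.sh" ;;
        start)   echo "./startAzureCluster.sh" ;;
        destroy) echo "./destroyAzureCluster.sh" ;;
        *)       echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# Ask for confirmation on standard input; succeed only on the answer "yes".
confirmed() {
    printf 'This permanently deletes all instances and drives. Type "yes" to continue: ' >&2
    read -r answer
    [ "$answer" = "yes" ]
}

# Run the script for an action, asking first when the action is "destroy".
run_action() {
    script=$(script_for "$1") || return 1
    if [ "$1" = "destroy" ] && ! confirmed; then
        echo "aborted" >&2
        return 1
    fi
    "$script"
}
```

For example, run_action stop calls ./shutdownAzureCluster.sh directly, while run_action destroy runs ./destroyAzureCluster.sh only after you type yes.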
