1. Get a local copy of the tools through Git

1.1 Make a directory somewhere on your machine and go to that directory.

1.2 Pull down the tools through Git with this command:

        git clone ssh://USERNAME@linux.cs.duke.edu/usr/research/proj/git/cps216/harness.git

    [Replace USERNAME with your user name. You should see a directory
    named 'harness' after this.]

2. Environment configuration

You need to configure the environment before you start using the tools to
manipulate an AWS cluster. This is a one-time process; you don't need to
repeat it afterwards.

2.1 Open the bash_profile in your home directory. Note that you can use
    any text editor to edit this file. What is shown here is what I did.

        vi ~/.bash_profile

    [If you are using linux.cs.duke.edu, edit ~/.my-bash_profile instead.]

2.2 Press 'i' to enter editing mode. Copy the lines below to the end of
    your file:

#--------------------------------------------------------
export EC2_HOME=[PATH TO ec2-api-tools- directory]
export JAVA_HOME=[FILL IN with path to Java 1.6+ home]
export HADOOP_EC2_HOME=[FILL IN with path to harness/hadoop_ec2_contrib_bin]
export AWS_HADOOP_HARNESS_HOME=[FILL IN with path to harness/aws_hadoop_harness]
export PATH=${PATH}:${JAVA_HOME}/bin:${EC2_HOME}/bin:${HADOOP_EC2_HOME}
export AWS_USER_ID=[FILL IN]
export AWS_ACCESS_KEY_ID=[FILL IN]
export AWS_SECRET_ACCESS_KEY=[FILL IN]
export EC2_PRIVATE_KEY=[FILL IN with private key file path]
export EC2_CERT=[FILL IN certificate file]
#-----------------------------------------------------

2.3 The lines above are statements of the form "export NAME=[VALUE]".
    Replace each [VALUE] with the actual value for your setup. Here is
    the mapping between each NAME and its [VALUE]:

    EC2_HOME:
        The path to the ec2-api-tools- directory, which should be in the
        harness directory you got in step 1.2.

    JAVA_HOME:
        The path to Java.
        If you are on linux.cs.duke.edu, the value can be "/usr".

    HADOOP_EC2_HOME:
        The path to the hadoop_ec2_contrib_bin directory, which should be
        in the harness directory you got in step 1.2.

    AWS_HADOOP_HARNESS_HOME:
        The path to the aws_hadoop_harness directory, which should be in
        the harness directory you got in step 1.2.

    PATH:
        Includes all the paths you specified above. You don't need to
        change this statement.

    AWS_USER_ID:
        Log in to aws.amazon.com and go to the "Security Credentials"
        page (see below for how to get to this page). The value is your
        AWS account ID, which is listed on that page.

    AWS_ACCESS_KEY_ID:
        Still on the "Security Credentials" page. In the "Access
        Credentials" section, click "Access Keys"; you should see your
        "Access Key ID". This is the value for AWS_ACCESS_KEY_ID. If you
        don't have a key, click "Create a new Access Key" to generate one.

    AWS_SECRET_ACCESS_KEY:
        Still on the "Security Credentials" page, "Access Credentials"
        section, "Access Keys" tab. Click "show"; you will see the value
        for AWS_SECRET_ACCESS_KEY.

    EC2_PRIVATE_KEY:
        The path to your private key file. To get this file, go to the
        "Security Credentials" page, "Access Credentials" section, and
        click "X.509 Certificates". Create a new certificate by clicking
        "Create a new Certificate", after which you will be asked to save
        a "pk-XXX.pem" file. The path to this file is the value for
        EC2_PRIVATE_KEY.

    EC2_CERT:
        In the same place ("X.509 Certificates"), click download to
        download the CERT file. The path to this file is the value for
        EC2_CERT.

    [How to get to the "Security Credentials" page:
     1. Go to aws.amazon.com, click the link at the top, "Sign in to the
        AWS Management Console", and use your email/password to log in.
     2. After you log in, click the "Account" link at the top left.
     3. Click the "Security Credentials" link.]

2.4 Last step in the environment setup!
    Go to the "Security Credentials" page, "Access Credentials" section,
    and click "Key Pairs". Get the key file by clicking "Download Public
    Key" (if you don't have a key yet, click "Create a New Key Pair"
    first) and save the file somewhere. The file name should be something
    like "rsa-xxxx.pem".

    Go to harness/hadoop_ec2_contrib_bin and open the file
    hadoop-ec2-env.sh:

        vi hadoop-ec2-env.sh

    Press 'i' to enter editing mode. Find the statement
    "KEY_NAME=my-keypair" and replace "my-keypair" with the name of the
    file you just downloaded (without the ".pem" suffix, only "rsa-xxxx").
    Find the statement "PRIVATE_KEY_PATH=path_to_your_keypair_file" and
    replace path_to_your_keypair_file with the path to the file you just
    downloaded (this time you need to include the ".pem" suffix).

    Congratulations! You are completely done with the environment setup.

3. Start using the tools to manipulate the AWS cluster

You should see a file "CHEATSHEET" in the harness directory. All the
commands you will be using, with an explanation of each, are listed there.
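
Before running the commands in CHEATSHEET, it may help to confirm that the
variables from step 2.2 were actually filled in. The script below is only a
sketch and is not part of the harness: the function names check_set,
check_path and check_environment are made up for illustration, and the
script assumes the variable names exactly as listed in step 2.2. It reports
variables that are empty, still hold a "[FILL IN ...]" placeholder, or point
to a path that does not exist. It never prints the key values themselves.

```shell
#!/bin/sh
# Hypothetical sanity check for the variables from step 2.2.
# Not part of the harness; all function names here are illustrative.

# check_set NAME: succeed if variable NAME is non-empty and does not
# start with "[", i.e. is not an unedited "[FILL IN ...]" placeholder.
# The name is looked up indirectly with eval, so pass trusted names only.
check_set() {
    eval "value=\${$1:-}"
    if [ -z "$value" ]; then
        echo "MISSING $1"
        return 1
    fi
    case "$value" in
        \[*) echo "PLACEHOLDER $1"; return 1 ;;
    esac
    echo "OK $1"
}

# check_path NAME: succeed if variable NAME holds a path that exists.
check_path() {
    eval "value=\${$1:-}"
    if [ -e "$value" ]; then
        echo "OK $1"
    else
        echo "NOT FOUND $1=$value"
        return 1
    fi
}

# check_environment: run every check and print a count of the problems.
check_environment() {
    failures=0
    for name in AWS_USER_ID AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        check_set "$name" || failures=$((failures + 1))
    done
    for name in EC2_HOME JAVA_HOME HADOOP_EC2_HOME \
                AWS_HADOOP_HARNESS_HOME EC2_PRIVATE_KEY EC2_CERT; do
        check_path "$name" || failures=$((failures + 1))
    done
    echo "$failures problem(s) found"
    [ "$failures" -eq 0 ]
}
```

To use it, save the script under a name of your choice (for example
check_env.sh), open a new terminal so that the edited bash_profile is
loaded, and run ". ./check_env.sh && check_environment". Any line that does
not start with "OK" points at a variable to revisit in step 2.3.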