Managing Clusters

This section describes how to perform common cluster operations: cloning, scaling up or down (manually or automatically), and terminating.

Cloning

To clone any previously created cluster:

  1. Browse to the cluster details.

  2. Click CLUSTER ACTIONS and select Clone.

  3. Confirm the operation. The create cluster form is displayed, prepopulated with settings that match the current cluster configuration.

  4. Click CREATE CLUSTER to start the cluster launch process.

Tip

Instead of cloning a cluster multiple times, you can save it as a template. Refer to Managing Cluster Templates.

Resizing

To resize a cluster:

  1. Browse to the cluster details.

  2. Click CLUSTER ACTIONS and select Resize. The cluster resize dialog is displayed.

  3. Use the +/- control to set how many worker and compute nodes to add to or remove from the cluster. For information on scaling compute nodes that use spot instances, refer to Spot Instances.

    You cannot scale a cluster below 3 worker nodes.
    If you created the cluster with at least 1 compute node, you cannot scale down to 0 compute nodes.
    You can add or remove worker nodes and compute nodes, but not both in the same operation.

  4. Click RESIZE CLUSTER to initiate the scale-up/scale-down.
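
The resize limits in step 3 can be summarized as a small validation routine. The following Python sketch is illustrative only and is not part of the product: the names `ClusterSize` and `validate_resize` are hypothetical. It assumes that the last restriction means worker and compute node counts cannot both change in one resize, and it treats a cluster that currently has compute nodes as one that was created with them.

```python
from dataclasses import dataclass
from typing import List

MIN_WORKER_NODES = 3   # a cluster cannot be scaled below 3 worker nodes
MIN_COMPUTE_NODES = 1  # applies only to clusters that have compute nodes


@dataclass
class ClusterSize:
    """Hypothetical container for the node counts shown in the resize dialog."""
    workers: int
    compute: int


def validate_resize(current: ClusterSize, target: ClusterSize) -> List[str]:
    """Return the violated rules; an empty list means the resize is allowed."""
    errors = []
    if target.workers < MIN_WORKER_NODES:
        errors.append("worker nodes cannot drop below 3")
    if target.compute < 0:
        errors.append("compute node count cannot be negative")
    elif current.compute >= MIN_COMPUTE_NODES and target.compute < MIN_COMPUTE_NODES:
        # Assumption: a cluster that has compute nodes now was created with them.
        errors.append("compute nodes cannot drop below 1")
    if target.workers != current.workers and target.compute != current.compute:
        # Assumption: "not at once" means one node type per resize operation.
        errors.append("resize worker and compute nodes in separate operations")
    return errors
```

For example, `validate_resize(ClusterSize(4, 2), ClusterSize(5, 2))` returns an empty list, while a request that lowers the worker count to 2 returns one violation.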

Auto Scaling

You can optionally enable cluster auto scaling. With auto scaling on, when the conditions that you specified are met, the cloud controller will add or remove worker or compute nodes.

You can enable cluster auto scaling either when creating a cluster, or later when the cluster is already running, using the option on the cluster details page:

  1. Browse to the cluster details.

  2. Click CLUSTER ACTIONS > Auto Scaling. The AUTO SCALING tab is displayed, where you can manage the auto scaling settings and add or remove scaling policies. To add a policy and configure auto scaling, select Add New Auto Scaling Policy.

Refer to Auto Scaling for more information.

Repairing Failed Nodes

The cloud controller monitors clusters, ensuring that any host-level failures that occur are quickly resolved by deleting or replacing failed nodes. This is called node repair.

Depending on the auto repair option that you selected when creating a cluster, the cloud controller will either automatically repair and replace a node that has failed, or it will notify you about the failure and let you decide what action to take.

Refer to Node Auto Repair for more information.
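
The two behaviors described above amount to a single branch on the auto repair option. The following Python sketch only illustrates that decision. `RepairAction` and `handle_failed_node` are hypothetical names, not part of the cloud controller.

```python
from enum import Enum


class RepairAction(Enum):
    """Hypothetical labels for the two responses to a failed node."""
    REPLACE_NODE = "replace"
    NOTIFY_USER = "notify"


def handle_failed_node(auto_repair_enabled: bool) -> RepairAction:
    """Choose the response to a failed node from the cluster's auto repair option."""
    if auto_repair_enabled:
        # Auto repair on: the failed node is repaired and replaced automatically.
        return RepairAction.REPLACE_NODE
    # Auto repair off: the user is notified and decides what action to take.
    return RepairAction.NOTIFY_USER
```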

Stopping and Restarting

HDCloud for AWS does not support stopping and restarting clusters. When you do not need your cluster, offload your data (refer to Copying Data Between a Cluster and Amazon S3), terminate the cluster, and create a new cluster when needed.

Terminating

Important

When the cluster is terminated, all cluster instances will be destroyed. To retain any HDFS data, you must offload your data from HDFS prior to terminating a cluster.

To terminate a cluster:

  1. Browse to the cluster details.

  2. Click CLUSTER ACTIONS and select TERMINATE.

  3. Acknowledge that you are aware that terminating the cluster will destroy all cluster instances.

  4. Click YES, TERMINATE CLUSTER to start the cluster termination process.