Using the Command Line Interface

This section will help you get started with the HDCloud CLI.

Getting Help

To get CLI help, append help to the end of a command. The following lists top-level help for the CLI:

hdc help

The following lists help for the create-cluster command, including its command options and the global options:

hdc create-cluster help

Command Structure

The CLI command structure consists of several parts: a set of global options, followed by the command, followed by a set of command options and arguments, which can include subcommands.

hdc [global options] command [command options] [arguments...]
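
For example, in the following invocation (used later in this section), resize-cluster is the command, and --node-type, --cluster-name, and --scaling-adjustment are command options with their arguments:

hdc resize-cluster --node-type worker --cluster-name cluster1234 --scaling-adjustment 1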

Command Output

You can control the output from the CLI using the --output option. Supported output formats include json and table. For example:

hdc list-clusters --output json

[ 
  {
    "ClusterName": "mytestcluster",
    "HDPVersion": "2.5",
    "ClusterType": "EDW-ETL: Apache Spark 2.0, Apache Hive 2",
    "Status": "AVAILABLE"
  }
]

hdc list-clusters --output table

+---------------+-----------+-------------+----------------------------------------------+--------------+
| CLUSTER NAME  |  STATUS   | HDP VERSION |                 CLUSTER TYPE                 | NODES STATUS |
+---------------+-----------+-------------+----------------------------------------------+--------------+
| test-cluster  | AVAILABLE |         2.5 | EDW-ETL: Apache Hive 1.2.1, Apache Spark 2.0 | HEALTHY      |
+---------------+-----------+-------------+----------------------------------------------+--------------+

Useful Commands

This section includes CLI commands that you may find useful.

Generating CLI Skeleton

The create-cluster command includes a generate-cli-skeleton option. This option generates JSON output that lists all of the options for the command. For example:

hdc create-cluster generate-cli-skeleton

{
  "ClusterName": "",
  "HDPVersion": "2.5",
  "ClusterType": "Data Science: Apache Spark 1.6, Apache Zeppelin 0.7.0",
  "Master": {
    "InstanceType": "m4.4xlarge",
    "VolumeType": "gp2",
    "VolumeSize": 32,
    "VolumeCount": 1,
    "Encrypted": false,
    "InstanceCount": 1,
    "Recipes": []
  },
  "Worker": {
    "InstanceType": "m3.xlarge",
    "VolumeType": "ephemeral",
    "VolumeSize": 40,
    "VolumeCount": 2,
    "Encrypted": false,
    "InstanceCount": 3,
    "Recipes": [],
    "RecoveryMode": "AUTO"
  },
  "Compute": {
    "InstanceType": "m3.xlarge",
    "VolumeType": "ephemeral",
    "VolumeSize": 40,
    "VolumeCount": 1,
    "Encrypted": false,
    "InstanceCount": 0,
    "Recipes": [],
    "RecoveryMode": "AUTO",
    "SpotPrice": "0"
  },
  "SSHKeyName": "",
  "RemoteAccess": "",
  "WebAccess": true,
  "HiveJDBCAccess": true,
  "ClusterComponentAccess": false,
  "ClusterAndAmbariUser": "",
  "ClusterAndAmbariPassword": "",
  "InstanceRole": "CREATE",
  "Network": {
    "VpcId": "",
    "SubnetId": ""
  },
  "Tags": {},
  "HiveMetastore": {
    "Name": "",
    "Username": "",
    "Password": "",
    "URL": "",
    "DatabaseType": ""
  },
  "DruidMetastore": {
    "Name": "",
    "Username": "",
    "Password": "",
    "URL": "",
    "DatabaseType": ""
  },
  "Configurations": []
  "Autoscaling": {                                           
    "Configurations": {                      
        "CooldownTime": 30,                        
        "ClusterMinSize": 3,                                        
        "ClusterMaxSize": 100     
    },
    "Policies": []
  }
}

Tips

  • To get more information about each option and see an example CLI skeleton, use hdc create-cluster generate-cli-skeleton --help.
  • You can use hdc describe-cluster --cluster-name name-of-an-existing-cluster to generate the CLI JSON skeleton for an existing cluster. Alternatively, you can obtain the CLI JSON skeleton from the cloud controller UI, either from a previously saved cluster template or when creating a cluster. For more information, refer to Managing Cluster Templates.
  • If you save this output in a JSON file and update it to specify your cluster parameters, you can then create a cluster using the hdc create-cluster command, as shown in the example after this list.
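
For example, assuming the skeleton is printed to standard output, you can save it to a file using shell redirection:

hdc create-cluster generate-cli-skeleton > create-cluster.json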

Setting Custom Properties

Add Configurations to the CLI skeleton to specify custom cluster configurations. For example:

"Configurations": [
    {
      "core-site": {
        "fs.trash.interval": "5000"
      }
    }
  ]
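
The Configurations array can hold more than one configuration type. For example, the following sketch assumes that other standard Hadoop configuration types, such as hdfs-site, are accepted as well:

"Configurations": [
    {
      "core-site": {
        "fs.trash.interval": "5000"
      }
    },
    {
      "hdfs-site": {
        "dfs.replication": "2"
      }
    }
  ]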

Adding Node Recipes

Add Recipes to the CLI skeleton to add custom scripts:

"Recipes": [
            {
                "URI": "http://some-site.com/test.sh",
                "Phase": "post"
            }
           ]

Accepted values for "Phase" are "pre" and "post".

Adding Custom Tags

Add Tags to the CLI skeleton to add custom tags for your cluster resources:

"Tags": {
        "tagkey1": "tagvalue1",
        "tagkey2": "tagvalue2"
        },

Creating a Cluster

If you save the output generated by the generate-cli-skeleton option in a JSON file and update it to specify your cluster parameters, you can create a cluster using:

hdc create-cluster --cli-input-json file.json

Where file.json is a JSON file describing your cluster.

Managing Clusters

List available cluster types

hdc list-cluster-types

List available clusters and check for node failures

hdc list-clusters

Describe an existing cluster

hdc describe-cluster --cluster-name cluster1234

Describe instances in an existing cluster

hdc describe-cluster instances --cluster-name cluster1234

Add one worker node to your cluster

hdc resize-cluster --node-type worker --cluster-name cluster1234 --scaling-adjustment 1

Remove one worker node from your cluster

hdc resize-cluster --node-type worker --cluster-name cluster1234 --scaling-adjustment -1

Add one compute node to your cluster

hdc resize-cluster --node-type compute --cluster-name cluster1234 --scaling-adjustment 1

Remove one compute node from your cluster

hdc resize-cluster --node-type compute --cluster-name cluster1234 --scaling-adjustment -1
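
The --scaling-adjustment value is not limited to 1 and -1; assuming the option accepts other integers, you could, for example, add three compute nodes in a single call:

hdc resize-cluster --node-type compute --cluster-name cluster1234 --scaling-adjustment 3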

Terminate your cluster

hdc terminate-cluster --cluster-name cluster1234

Monitoring and Repairing Your Cluster

Enable auto repair

When creating a cluster using the CLI skeleton, you can enable node auto repair for worker or compute nodes by setting the RecoveryMode option to AUTO (the default). To disable node auto repair, set RecoveryMode to MANUAL.
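
For example, to disable auto repair for compute nodes, set RecoveryMode to MANUAL in the Compute section of the skeleton (other Compute fields omitted here for brevity):

"Compute": {
    ...
    "RecoveryMode": "MANUAL"
  }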

Check the health of your clusters

hdc list-clusters --output table
+---------------+-----------+-------------+----------------------------------------------+--------------+
| CLUSTER NAME  |  STATUS   | HDP VERSION |                 CLUSTER TYPE                 | NODES STATUS |
+---------------+-----------+-------------+----------------------------------------------+--------------+
| test-cluster  | AVAILABLE |         2.5 | EDW-ETL: Apache Hive 1.2.1, Apache Spark 2.0 | HEALTHY      |
| test12356     | AVAILABLE |         2.5 | EDW-ETL: Apache Hive 1.2.1, Apache Spark 2.0 | UNHEALTHY    |
+---------------+-----------+-------------+----------------------------------------------+--------------+
The "NodeStatus" shows whether a cluster is HEALTHY or UNHEALTHY.

To identify which nodes in a cluster are unhealthy, use:

hdc describe-cluster instances --cluster-name test-cluster2 --output table
+---------------------+----------------------------+----------------+------------+-----------------+-------------+------------------------+
|     INSTANCE ID     |          HOSTNAME          |   PUBLIC IP    | PRIVATE IP | INSTANCE STATUS | HOST STATUS |          TYPE          |
+---------------------+----------------------------+----------------+------------+-----------------+-------------+------------------------+
| i-02db73c97e20e2117 | ip-10-0-3-241.ec2.internal | 34.198.195.20  | 10.0.3.241 | REGISTERED      | HEALTHY     | master - ambari server |
| i-003289e28c98d9902 | ip-10-0-3-41.ec2.internal  | 54.175.188.252 | 10.0.3.41  | REGISTERED      | HEALTHY     | worker                 |
| i-005ff5e20d16fcfaa | ip-10-0-3-25.ec2.internal  | 54.174.90.99   | 10.0.3.25  | REGISTERED      | HEALTHY     | worker                 |
| i-0360f27ea0441c220 | ip-10-0-3-113.ec2.internal | 54.82.247.15   | 10.0.3.113 | REGISTERED      | UNHEALTHY   | worker                 |
+---------------------+----------------------------+----------------+------------+-----------------+-------------+------------------------+

Repair your cluster by replacing failed nodes

To repair an unhealthy worker node, use:

hdc repair-cluster --cluster-name test-cluster2 --node-type worker

To repair an unhealthy compute node, use:

hdc repair-cluster --cluster-name test-cluster2 --node-type compute

Remove failed nodes

To remove a failed node (without replacing it), use:

hdc repair-cluster --cluster-name cluster1234 --node-type worker --remove-only true

Configuring Auto Scaling for Your Cluster

Enable auto scaling when creating a cluster

To enable auto scaling when creating a cluster, add the following to the CLI skeleton:

"Autoscaling": {
    "Configurations": {
      "CooldownTime": 10,
      "ClusterMinSize": 1,
      "ClusterMaxSize": 40
    },
    "Policies": [
      {
        "PolicyName": "myscalingpolicy1",
        "ScalingAdjustment": 1,
        "ScalingDefinition": "cpu_threshold_exceeded",
        "Operator": ">",
        "Threshold": 50,
        "Period": 1,
        "NodeType": "worker"
      }
    ]
  },

The Configurations section is mandatory, but the Policies array can be left empty when creating a cluster.
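
For example, a minimal Autoscaling block that enables auto scaling without any policies looks like this (values taken from the CLI skeleton above):

"Autoscaling": {
    "Configurations": {
      "CooldownTime": 30,
      "ClusterMinSize": 3,
      "ClusterMaxSize": 100
    },
    "Policies": []
  }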

Enable auto scaling for an existing cluster

To enable auto scaling for an existing cluster, use the enable-autoscaling command. For example:

hdc enable-autoscaling --cluster-name mytestcluster1234

Disable auto scaling

hdc disable-autoscaling --cluster-name mytestcluster1234

List auto scaling definitions

hdc list-autoscaling-definitions
[
  {
    "Name": "cpu_threshold_exceeded",
    "Label": "CPU usage"
  },
  {
    "Name": "memory_threshold_exceeded",
    "Label": "Memory usage"
  },
  {
    "Name": "namenode_capacity_threshold_exceeded",
    "Label": "HDFS usage"
  },
  {
    "Name": "yarn_root_queue_memory",
    "Label": "Cluster capacity"
  }
]

Add auto scaling policy

hdc add-autoscaling-policy --cluster-name mytestcluster1234 --policy-name mytestpolicy --scaling-adjustment -1 --scaling-definition cpu_threshold_exceeded --operator "<" --threshold 5 --period 10 --node-type worker

Use hdc list-autoscaling-definitions to get the list of valid auto scaling metrics.
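
For example, the following call mirrors the scale-up policy shown in the CLI skeleton earlier, adding one worker node when the cpu_threshold_exceeded metric goes above 50:

hdc add-autoscaling-policy --cluster-name mytestcluster1234 --policy-name myscalingpolicy1 --scaling-adjustment 1 --scaling-definition cpu_threshold_exceeded --operator ">" --threshold 50 --period 1 --node-type worker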

Remove auto scaling policy

hdc remove-autoscaling-policy --cluster-name mytestcluster1234 --policy-name mytestpolicy

Configure auto scaling settings

Configure the cooldown time and min/max cluster size for a given cluster:

hdc configure-autoscaling --cluster-name mytestcluster1234 --cooldown-time 30 --min-cluster-size 5 --max-cluster-size 50

Managing Metastores

List metastores

hdc list-metastores
[ 
  {
    "Name": "testmeta",
    "Username": "dbialek",
    "Password": "",
    "URL": "rds4hive.crjdujkaybgv.us-east-1.rds.amazonaws.com:5432/Hive",
    "DatabaseType": "POSTGRES",
    "HDPVersion": "2.5"
  }
]

Register Hive metastore

You can register an existing RDS instance as a Hive metastore using the following command:

hdc register-metastore --rds-name domi-rds --rds-username dbialek --rds-password My1Secret2Password3 --rds-url domi-rds.crjdujkaybgv.us-east-1.rds.amazonaws.com:5432/testdb --rds-type hive

Alternatively, you can register a Hive metastore when creating a cluster using the "HiveMetastore" field in the CLI skeleton.

If you have previously registered a metastore with the cloud controller (either by using the hdc register-metastore command or its UI equivalent), provide only the name of the metastore:

"HiveMetastore": {
    "Name": "testmeta"
  }

If you have not previously registered a metastore with the cloud controller, provide all connection information:

"HiveMetastore": {
    "Name": "testmeta",
    "Username": "dbialek",
    "Password": "",
    "URL": "rds4hive.crjdujkaybgv.us-east-1.rds.amazonaws.com:5432/Hive",
    "DatabaseType": "POSTGRES",
    "HDPVersion": "2.5"
  }

Register Druid metastore

You can register an existing RDS instance as a Druid metastore using the following command:

hdc register-metastore --rds-name domi-rds2 --rds-username dbialek --rds-password My1Secret2Password3 --rds-url domi-rds.crjdujkaybgv.us-east-1.rds.amazonaws.com:5432/testdb2 --rds-type druid

Alternatively, you can register a Druid metastore when creating a cluster by using the "DruidMetastore" field in the CLI skeleton.
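
As with the Hive metastore, if the Druid metastore has already been registered with the cloud controller, it should be enough to provide only its name in the skeleton (the name below is illustrative):

"DruidMetastore": {
    "Name": "testmeta2"
  }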